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ABSTRACT 


This thesis deals with random processes which are stationary, 
ergodic, and described by correlation functions or power density spectra. 
An attempt has been made to develop a new approach to the study and control 
of random processes which is simple, stresses physical rather than mathe- 
matical interpretation, and is valid when a number of statistically related 
processes are to be processed simultaneously. Among the origianl and fun- 
damental results of this investigation are: 


(1) A closed-form solution is presented for the optimum multi-di-~ 
mensional system in the Wiener sense. This system operates on n correlated 
random input signals and produces m desired outputs, each of which has 
minimum mean-square error. The solution is dependent upon the factoriza- 
tion of a matrix Q(s) of the cross-power density spectra of the input signals 
into two matrices, such that @(s) = G(-s). G'(s). The nxn physical system 
G(s) determined from this procedure must be realizable and inverse realizable, 
and is the system which would reproduce the measured statistics when excited 
by n uncorrelated white noise sources, 


(2) A general solution is derived for the above matrix factorization 
problem, valid without regard to order, providing the spectra satisfy a 
realizability requirement. The method employs a series of simple matrix 
transformations which manipulate the original matrix into desired forms. 
The key to this solution is a general procedure for reducing a matrix with 
polynomial elements to impotent form, having a constant determinant. This 
latter step is also an original contribution to the theory of matrices with 
algebraic elements. With this solution to the matrix factorization problem, 
essentially no conceptual difference remains between single and multi-di- 
mensional random processes. 


(3) The optimum single or multi-dimensional prediction operation is 


~ii- 


shown to result from a continuous measurement of the current state 
variables of the hypothetical model G(s) which can create the random pro- 
cess from white noise excitation. These state variables are then weighted 
according to their decay as initial conditions in the desired prediction time 
and the''decayed "output or outputs are the desired prediction. Thus, ex- 
pected behavior of the random process over all future time is compactly 
summarized in the current values of these state variables. 


(4) It is proved that correlation functions measured between two 
variables in a linear system can be viewed as an initial condition response 
of this system. Also, the well-known Wiener-Hopf equation is shown merely 
to require that every error be uncorrelated with past values of every input 
Signal, 


(5) If one or more noisy signals have a power density spectra matrix 
® (s), which can be factored into G(-s) G'(s), and if G(s) is separated 
such that G(s) = S(s) + N(s), where S(s) and N(s) have signal and noise poles, 
respectively, then it is shown that the optimum filter is a unity feedback 
system with forward transference S(s) N (s). This very general result is 
valid for single or multi-dimensional optimum filtering problems, 


(6) A quantitative substitute for the Nyquist sampling theorem is 
presented which is concerned with a measure of the irrecoverable error in- 
herent in representing a continuous random process by its samples. Also, 
the new results in continuous random process theory derived herein are ex- 
tended to the discrete case. 


(7) The concept of ''state" of a random process is advanced as 
fundamental information for control use. Two new design principles are 
discussed for the bang-bang control of a linear system subject to a random 
input. In one, suitable for multi-dimensional full throw control, the deter- 
minate Second Method of Lyapunov is extended to include random processes. 


The basic contributions of this thesis are (1) a complete theory of 
multi-dimensional random processes, (2) a simple physical explanation for 
the optimum linear filter and predictor using white-noise generating models, 


and (3) a new approach to stochastic control problems, especially those 
involving saturation, using the concept of the "state" of a random process. 
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CHAPTER I. 


INTRODUCTION 


The word "random" is an adjective which mankind has come to 
use in apology for unwillingness or inability to measure fundamental 
causes for events observed in Nature. Of these events, the random 
process which goes on continuously and indefinitely has captured the 
interest of mathematicians and engineers. There is something com- 
pelling about attempting to describe that which is ever changing, and 


thus undescribable. 


This thesis is concerned with random processes in their sim- 
plest form -- with statistics that do not change with time, and whose 
properties are adequately described by the well-known correlation 
functions. Many able researchers have cleared this path and it could 
well be asked, like an echo from the Second World War, "Tg this trip 


necessary ?'! 


To begin with, a research investigation is generally based on 
aggravation, either with what is not known or with what is known, In 
this work, the latter case is true. It is the opinion of the author that 
the classic and beautiful core theory of Wiener in this area, by its 
very mathematical eloquence, has tended to suppress a more funda- 
mental understanding of what can be known in a random process and 


what cannot. 


In essence, the original work of this thesis starts with the 
well-known fact that the random processes considered here act as if 
they came from a linear system which is excited by the most random 
of signals, "white" noise. This linear system specifies the particular 
random process, and focussing attention on its determinate structure 
iS amore satisfying approach, at least to the engineer, than is ac- 
cepting the manipulation of statistical properties of the ever-changing 


output of this system. 


some of the unsolved problems and prominent possibilities in 


er ee 





random process theory which come to mind for possible attack are: 


(1) Conventionally, derivations in the Wiener theory are made 
for optimum systems in the time domain. A pure transform approach 
appears much more desirable. 

(2) A general closed-form solution for the optimum multi -di- 
mensional system has not yet been given in the literature. 

(3) A means has not yet been found for determining a physical 
system capable of reproducing signals with the given statistics of mul- 
ti-dimensional random processes. 

(4) The fundamental results of Wiener theory are the optimum 
predictor and filter. It may be possible that these have a very simple 
interpretation in terms of the equivalent white-noise driven system. 

(5) The correlation functions of many observed random pro- 
cesses have the appearance of an initial condition response of a linear 
system. If this is true, what linear system and what initial conditions ? 

(6) What effect would white ndise have if suddenly applied to an 
otherwise quiescent linear system ? 

(7) There is no valid measure of the inherent error due to sam- 
pling of a random process to replace the ''Go-No Go" nature of the 
Nyquist Sampling Theorem. 

(8) If a linear theory produces all the knowable information 
about an input random process, is there some way of intelligently 
using this to control a physical system which has limitations such as 
saturation? No suitable approach to the on-off or bang-bang control 
problem with random excitation has been made which makes complete 
use of this information, 

(9) If a random process is to be examined by means of inves- 
tigation of an effective physical system, can some determinate ap- 
proaches to systems analysis such as the ''Second Method of Lyapunov'' 


be extended to include random processes ? 


This thesis provides a quantitative answer to each of these ques- 


tions or possibilities. The author believes that the results found in this 


aoe 


thesis investigation, because of their simplicity and generality, provide 
the most effective means for understanding the nature of stationary ran- 


dom processes. 





CHAPTER IH. 


DERIVATION OF OPTIMUM 
SINGLE AND MULTIDIMENSIONAL SYSTEMS 


2.1 Introduction 


This chapter is concerned with linear systems which operate on 
stationary random processes so as to minimize a quadratic measure of 
error between the desired and actual outputs. In the case of a single ran- 
dom signal, perhaps corrupted by noise, the results of this theory have 
been known for over a decade. Why, then, is it necessary to retrace 


such well-worn steps ? 


There are two reasons for this apparent duplication. First of 
all, the author feels that the time-domain derivations found in many 
standard texts of the optimum Wiener filter are unnecessarily compli- 
cated and tend to obscure the basic simplicity of the ideas expressed. 
secondly and more important, when the optimum system to process two 
or more signals simultaneously is derived, the conventional methods 
rapidly become enmeshed in their own symbology, whereas the steps of 
the single-signal frequency domain approach to be described in this 


chapter allow direct extension to the multi-dimensional case. 


2.2 Historical perspective 


In this country, the origin of the statistical theory of optimum 
linear systems was the wartime work of Wiener: A parallel develop- 
ment in Russia at approximately the same time was made by Kolmaeeeora 
The structure of the basic theory was thus well-formed by 1950 for prob- 
lems involving prediction and filtering of a single stationary random pro- 
cess in the presence of additive noise. Significant extensions and clari- 
fication of Wiener's work were made by Zadeh and Ragazcini’, Bode 
and $narnon, Siam wee Pee and Newton The latter's work 
was of particular significance, since it introduced the concept of optimi- 


zation with constraints in order to satisfy certain practical engineering 
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requirements of a system which the basic theory neglected. In the last 
decade, graduate-level control systems engineering texts have gener- 

9 
ally emphasized the statistical approach. These include books by Truxal , 


ies 
Mewione Sanit Seifert and Siseuuee and Lanning and Battin . 


In the multi-dimensional case, the theory is not as well-developed. 
mestcott derived an optimum configuration for the two-dimensional 
case. ecu used a partial matrix approach and successfully derived 
the optimum unrealizable configuration, but his realizable solution was 
only applicable upon very restricted signal conditions. Hsieh and Teeondesme 
presented a method for solving for the optimum system involving unde- 
termined coefficients, but the meaning of their solution was obscured 
by the formidable notation employed and no proof of the adequacy of 


their method was offered. 


2.3 Summary of linear statistical theory 


Figure 2.1 shows a typical time record of a random process in- 
volving two variables, x and y. The signals to be considered under this 
theory are stationary; that is, they have statistical properties which do 
not change with time. Also, these statistical properties can be approx- 
imated by measurements made on a single long but finite time-recording 
of the particular continuous signal -- that is, the processes satisfy the 


ergodic hypothesis. 


SAT \ Faw ee ee 
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Figure 2.1 Typical random processes 


The objective of statistical analysis of a random process is to 
detect cause-effect relationships between events -- or signal levels -- 
Separated in time. The basic tools in this analysis are the auto-correla- 
tion and the cross-correlation functions. The auto-correlation function, 


ae ), is defined as the average value of the product of the instantane- 
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ous signal and the signal level T seconds later. 
Yo?) fk {xt " x(t +7) } (248) 


where the symbol = is a defining equality and the operator {+} means 
"the expected value of". Expressed in integral form for the class of 
signals considered, 
lime 4 2 
ae = myed Br f dt x(t) ° x(t +T ) (2, 2) 
°-T 
Figure 2, 2 shows a typical auto-correlation function. Note that 
it is even about the7 = 0 axis, Cem) iio -T~), since replacing t by 
t -T in Equation 2.2 does not affect its value. The maximum value of 
ae is at T= 0 for any stationary signal observed in the real world 
(a proof is given by Truxal’, ) 
The cross-correlation function, ots ), is defined as the average 
value of the product of the instantaneous signal level of one variable, x, 


and that of another signal, y,7 seconds later. 


P gi) 265 fu " y(t ry} (2, 3) 
PiclT) = qyeo aR | at xt)’ yt +7) (2. 4) 
“T 
Peat) 


7 
} 
Figure 2,2 Atypical auto-correlation function 


In this case, replacing t by t -7T in the integral form yields the 
definition of p (-T ), and the peak value of sd (7 ) does not necessar- 
yx “oF 


ily occur at the origin, Summarizing, 





P (-r) 


XX 


Pyl-T) 


Lae (015) 


aa (2. 6) 


The auto-correlation functions and all possible cross-correla- 
tion functions among members of a set of random signals completely 


describe the particular process for the purposes of a linear theory. 


One significant use of the auto-correlation function is that YP (0) 
is, by definition, the mean square value of x. For example, this makes 
it a useful measure of the accuracy of a system when the signal concerned 


is the error. 


Since the correlation functions (for 77 0) have the same appear- 
ance as transient signals observed in linear systems it is logical to de- 
fine the Laplace transforms of these functions and inquire as to their 
potential use. As the functions are defined for both positive and nega- 
tive 7 , the bilateral or "two-sided" Laplace transform is selected for 
use. The bilateral Laplace transform evaluates the positive-time part 
of a signal just as the one-sided Laplace transform does, but the nega- 
tive-time portion has the sign of t changed (ie: "flipped over" thet = 
0 axis), evaluated as a positive-time signal, and the sign of s, the trans- 


form variable, is changed to -s. 


In order to ensure a one-to-one correspondence between the trans- 
form and the time-domain expression, it is necessary to specify that all 
poles in the right half plane (or "negative" poles) correspond to functions 


in negative time and not unstable functions in positive time. 


In this work, the bilateral Laplace transform of the auto and 
cross-correlation function is defined as the auto or cross power density 
spectrum, @_ As) or @,,(s), respectively. The notion of power density 


arises in the following fashion: 


The mean square value of a random signal x is envisioned as a 
generalized form of average energy because of its quadratic nature, and 


is equal by definition to « P60) Patt YO 6) is finite, it is equal to the 


Bs 








sum of the residues of either the left- half or right-half plane poles of 
the transform G.. (s), as seen directly from a partial fraction expan- 
sion of ® (s) and term-by-term inversion, But by the residue theorem 
of complex variable theory, the evaluation of a closed contour up the 
imaginary axis of the s-plane and enclosing the left-half-plane at infin- 
ity will yield 2zj x summation of residues, providing the contour is of 


the order of no less than = aS S—»cd. That is, @ is) must contain 


2 
Ss 
at least two more poles than zeros for a finite mean square value, ) 60): 
to exist. 
Thus, jeo 
1 
P= 5 fds Oss) (2.7) 
-j00 
Let s = jw 
a0 
1 
P60) = [o D ) (2. 8) 


ond 


The mean square value (or power) of a signal is thus seen to be 
proportional to the integral of @ lo) over allw, and Q 4b?) quite natu- 
rally is visualized as a power density per unit w. Most authors have in- 
cluded the = in the definition of the power density spectrum so that 
the integral over all w yields the total average power, but this appears 
to be less natural than retaining the pure transform relationship, espe- 
cially since the name "power" is a misnomer in itself. The w notation 
is the most common encountered in past literature on random processes, 
and brings to mind a weighting of harmonic content, considering the ran- 
dom process to be a superposition of an infinite number of infinitely small 


simusoidal waves. 


It might be argued that the choice of nomenclature is a trivial 
matter, but in as much as it influences basic conceptualization of a ran- 


dom process, it is very important and deserves elaboration. 


Ten years ago in automatic control literature, the transfer function 
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of a linear system was invariably written as G(jw), and much was made 
of plots of frequency response on polar or logarithmic coordinates, Fre- 
quency response was almost regarded as an end in itself, and design 
specification in terms of these characteristics helped propagate this 
belief. However, the acceptance of Evan's root-locus Beaiod: | and the 
strong emphasis by Trees and others towards use of the Laplace 
transform helped unify the differential equation, frequency response, 

and transient response approaches to dynamic behavior of linear systems. 
In proper perspective, frequency response is an often desirable experi- 
mental description of a system and provides, on logarithmic coordinates, 
a rapid means for design of simple control systems when specifications 
on transient behavior are loose. Frequency response is perhaps best 
visualized as an imaginary axis scan on the complex plane, as shown in 
Figure 2.3, where the function is evaluated by the complex product of 
vectors from all system zeroes divided by vectors from all system poles 


to the particular s = jw point under consideration. 


JQ) 


s ~ PLANE 


Figure 2.3. Frequency response viewed as imaginary axis scan 


Since linear systems were previously regarded in terms of how 
they altered the magnitude and phase of an input sinusoidal signal, essen- 
tially a communications engineering viewpoint, it is natural that random 
processes should have been described in terms of relative frequency 
content. But now that the Laplace transform -- high-lighting the system. 
poles and zeroes -- has emerged as perhaps the best index to the prop- 
erties of a linear system, it is necessary to take the viewpoint in a ran- 
dom process that the characteristics of interest are the poles and zeroes 


of the power density spectrum O is), and not necessarily the spectrum 
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shape. A useful conception of the spectral representation, as a function 
of w, is shown in Figure 2.4, where again the magnitude of the spectrum 
is determined as the resultant of vectors from all poles and zeroes of 


Oks) to the s = jw point. Note that an auto power density spectrum 


JW 


Figure 2.4 Power density spectrum viewed as imaginary axis scan 


has a symmetrical distribution of poles and zeroes, since the relation- 
i / = =; a - 

ship Te”) en /~) becomes P As) ® s) in the frequency 

domain which means that, term for term, the LHF poles and zeroes 


must equal the RHP poles and zeroes, 


The basic tools for the examination of random process have been 
presented -- the correlation functions and their transform mates, the 
power density spectra. It now remains to specify how these character- 


istics are altered by passage through a linear system. 


2.4 A general formula for power density spectra transformations 


A derivation is made in this section of a compact expression 
of the cross (or auto) power density spectra between any two signals 
in a linear system as a function of the cross power density spectra of 
the system inputs. This resulting formula will be used consistently in 


this and remaining chapters because of its generality and simplicity. 


The general problem to be considered is pictured in Figure 2.5. 
x and y are two variables (x may equal y) which are the linear responses 
to two sets of random inputs, x, and Yq each individual input being oper- 


ated on by a system weighting function, g(t) or h(t). The desired quan- 
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Figure 2.5 General model for linear system 


tity is Q ..,(s)s the known quantities are the cross-power density spectra 


between any two of the inputs x, and ve 


4 Ae WW 


x) = 2 g(x) 5 yt) = 2 ht)* y,(0) 


A=] Ja] 


where ''* '' is a symbolic operator expressing convolution. Transform- 


ing, 
“Vu SW. 
xis) = 2 Gls)X(s) 5 Ws) = SF Hs) * Ys) 
A=} J= | 


F xy(™ a E, { x(t) ° yt +r} 
® is) = aa {9m} - E, { xt X [ vit v}} 


assuming that the integration involved with averaging in time can 
commute with the integration of the Laplace transform. The subscripts 
on the operator indicate the time variable which is used in the opera- 


tion. 


Sie 








Consider a length of signal which exists for duration 2T, where 


T is arbitrarily large but finite, and which is zero elsewhere. 


T wi 
_ lim 1 -sT 
D 4s) = mseo OT fa alt) Jt ja) me dv 


lim 1 * st 
“Tyco OT 2 dt x(t) e  y(s) 


lim 1 — 
= Teo oT x( Ss) Y(s) 


which is a standard result found, for example, in Rice i and Solodov- 


miley” 


But, substituting the values of X(-s) and Y(s), 


eee 
oe 2 G.(-s) H(s) ¥,(s) 
Ww We 
lim  X-s) YAs) 
= > > G.(-s) H.(s) i j 
A=] J=) . J ene: ye Shy 
nN SW 
“2 L G,(-s) H,(s) © oy (5) (2.9) 


which is the desired result. Several examples will illustrate the con- 
venience of this formula. 


Consider first the system of Figure 2.6. 


X (Ss) \n/ (S) Vis) 


Figure 2.6 A simple linear system 


X(s) = X(s); Y(s) = X(s)° Ws) 


immer = wis)  (s) (2. 10) 
xy xx 


=e 





a basic result which has immediate practical consequences, If x and y 
are the available input and output signals of an otherwise inaccessible 
system, the system transfer function can be determined without intro- 
ducing test disturbances by analyzing the cross-correlation between 


x and y. 


Also, 


PD is) = Dols) ws) wes) (2.11) 


Next, Figure 2.7 shows a typical summing operation. 


X (Ss) G (S) 
“Ce Z &) 
=p 

Ys) H (Ss) 


Figure 2.7 Atypical summing operation 


Z(s) = X(s) * G(s) + Y(s) H(s) 


G(s) i b is) G(-s) G(s) + @,,,(5) G(-s) H(s) + 
© (s) H(-s) G(s) + G(s) H(-s) H(s) (2. 12) 


which is obtained by inspection by performing the necessary cross- 


multiplication and observing the proper sign of s. 


2.5 Single-dimensional optimum systems 


The classical Wiener theory of an optimum linear system to 
operate on a random process will now be derived using transform ex- 
pressions wherever possible. This clear and direct approach is useful 
in its own right but is basically intended to provide an introduction to 


a similar development for multi-dimensional systems to follow. 


Figure 2.8 shows the basic configuration to be studied. The 
stationary input random signal v in general will contain a signal to be 


operated on and an extraneous noise component, The ideal signal i is 


oe 











Figure 2.8 Configuration of an optimum system 


the mathematical result of some desired operation on the signal compo- 
nent of the input, such as filtering, prediction, or some linear function 


of the signal. Figure 2.9 shows an elaboration of this structure, where 





Figure 2.9 Formation of the ideal signal 


the signal component s is operated on by some not-necessarily physi- 
cally realizable transfer function, G(s), such as l, er or s. If @.. 


and ® . are known, and since 


® (s) = O . G(s) + ® . G(s) (2. 13) 


O tS) is as equally valid a statistical description of the desired opera- 


tion as is specification of G(s). 


Thee@ror signal, e, is the difference between the actual response 
of the system to be determined, W(s), and the ideal signal. The optimum 
system will minimize the mean value of error squared, ee = P (0), 
which is a satisfactory error criteria for many purposes. The use of 
the variance of the first probability distribution of error is a natural 
choice when longtime properties of signals are being examined, asa 
more complex error criterion besides being mathematically intractable 


would require more statistical knowledge of the processes involved. mo 


=A 








aceee wel 
o = Foe - ag vf ds D (6) 


E(s) = I(s) - Vis) W(s) 


wet8) = Gils) - OB, is) wis) - Bs) wi-s) + 
7 W(s) W(-s) 


The determination of the optimum W(s) in order to minimize the 
integral expression is the standard problem of the calculus of variations. 
If a perturbation in W(s) is made, called a variation, a resulting pertur- 
bation or variation in rca cttee More formally, W(s) is replaced by 
W(s) +€ § W(s), where € is a "small" constant and the variation JW(s) 
is any allowable change in W(s), or alternately any system which could 
be paralleled with W(s). This restricts dW(s) to have the properties of 
physically-realizable and stable systems, that is, with no poles in the 
right-half plane. Also, for a finite e , dWs) must not be of such order 
as to provide a component of white noise at e when excited by v. e” is 
then expanded as a power series in € around €= 0. The optimum system 
will have been found when the coefficient of the first power of € is zero 
regardless of the form of Ws) -- in other words, no small allowable 
change in W(s) tends to decrease the value of the integral. 


“FX -2 ee 
OE . = 0 


do? : = ie £18.00) 


assuming that differentiation may be performed under the integral sign. 





The variational notation will now be shown to follow the usual 
rules for differentiation, considering the individual terms of 0 eel®) 


consecutively. 
|. | 


Hise 





> e _” = 


: 3 : 
§ [Gis ws| = Bis) 2 [ wis) + € dwis}, 9 = Bitsy Fwis) 


F[F (sy we-s)] =O (sy 2— [we-s) +6 Swe] = 8s) Swe 


J wis wi-s)| = ® is) 2 {Cms) + € £ wis)| [wi-s) > Sw -aly > 


= 0%) | fwis) W(-s) + Ws) Swi-s)| 


The only restriction on this analogy with differentiation is that the 
variation of W(s) or W(-s) must carry the proper sign of s. 


Joo 
EB 1 
aA =or= ay f B00 J Ws) - Q As) d W(-s) 
~)eo 


+ @ &) [ wi-s) d Ws) + Ws) Sin-s 


To simplify this expression, several auxiliary results are needed. 


(1) ®. (-s) = @ As), from the fact that P( SP) = P_, Equation 2.6. 


(2) The sign of s may be changed in any single term of the above integral, 
without affecting its value, since the limit exchange and the sign change 


of the differential ds have cancelling effects. 


Changing the sign of terms as necessary to be able to factor d W(-s) 
and identifying OD (-s) as G ts) and @ Xs) as  A-s) : 


Ce = mj ds © ¢ W-s) S D (s) + ® (s) wis) | 
= }00 
If the integral exists and the contour is selected so as to enclose 
the left-half plane, the LHP residues must sum to zero for arbitrary 
d W(-s), which has all its poles in the right-half plane. Obviously, the 
function } ws W(s) - Bs) must have no simple poles in the LHP, 
say ats = - a, for the sum of residues is 2 dWla,) , an arbitrary 


number for arbitrary 6 W(-s). If this function has a multiple pole, say 


‘ | th 
es a ee then JS W(-s) could be selected so as to include a m-l 
order zero at s = - a (only poles must be in the RHP), leaving the first 
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order case. Thus, it has been shown that [ D () W(s) - F is) can 
have no poles in the left half plane, or 


LZ | Bey wis) | “Ax [Bis | (2,14) 


where fz" is a picturesque operator used by smith!! to indicate the 
operation of inverting a transform into is positive and negative time 
parts (= e or the inverse Fourier transform) and using only the posi- 
tive time part in taking the unilateral Laplace transform, . Despite a 
possible question as to the uni- or bi-lateral nature of f , this compact 


notation will be used subsequently to denote the casting out of RHP poles. 


A functional equality of LHP poles, such as in equation 2, 14 
above, is not affected by multiplication of both sides by the same ar- 
bitrary transform having poles in the RHP. For example, 4 W(-s) is 
such a function. Now, 6 vyis), because of ‘i even nature, can always 
be factored into DO i- -s) OE (s), where om exis s) contains only RHP 


poles and zeroes and O. (s) contains only LHP poles and zeroes. 


“te { ghia ® yh) wis) AF Fe a Bs) } 
(s) 
$b" (s) W(s) At Te a 


f71 9 ls) (2. 15) 


1 
_ Oe ots) 


This is the desired solution for an optimum system under a 


mean-square error criterion. 
To review the derivation procedure, 


» Oo eS) was found using Equation 2.9. (2d ®, 2(5) was expressed 


in the compact variational | notation, (3) d. 6” e was —— in the follow- 
ing form: Sk - 7 fos 2 d W(-s) Bes) W(s) - Dts], (4) The 


left-half plane poles of “E8) W(s) were shown equal to ieee of g. (s), 


Sar = 





and both sides of this equality were multiplied by the inverse factor of 
t+ 

®  (-s). 
VV 


An example of this procedure is given next to illustrate the ease 
in derivation of extensions to the basic theory. This modification is due 
8, 10 

to Newton 


transducer. Figure 2.10 shows that a signal driving a fixed element 


and is an attempt to control saturation in a given power 


G (s) is 
p 

Cc 

G. (Ss) 
a € 

+ 
FIXED a 
ELEMENTS 


Figure 2.10 Control of saturation in fixed elements 


to have some linear function (G_(s) usually equals 1, s, or = of itself 
reproduced as the hypothetical signal c, which will have its mean-square 
value constrained by a Lagrange multiplier as the error is minimized 


in order to control the probability of saturation. 


SL 3} _ -h [as {58 +A S$.)} 


-)co 


E(s) = I(s) - Vis) ° Ws) ° G,(s) 
C(s) = Wis) Ws) G(s) 
aie. 
D's) - cs) - O68) wi-s) G,{-s) + Oss) Ws) W(-s) G(s) G,(-s) 
m5 7 . ~@®. (§) Wis) GS) 
Oe) = OS) wis) w-s) G(s) G_(-s) IN (3) Ge 


J D es -? (8) Gls) § Wis) - G(s) G,(-s) d W(-s) 
+ OG) Gis) G(-s)_ [ wis) f wi-s) + wi-s) f wis) | 
SOS) -OS) G(s) Gi-s) [wis)d w-s) + wi-s) Jwis) | 


J ef + ro =Q0-= : [as g W(-s) |. 2 ® (s) G,(-s) 


27} 





~ joo 


+2 O (S) [G,s) G,(-s) Ha) G(s) G_(-s)| ws 


VV 


=a 





; er 
+z 1 (Bs) G(s) G,(-s) +) G(s) G(-s)] wis)} -L# {a4-s) Bs) 


ee ya J_ Gets) By) 
BiG) [Gpis) G4es) +A Fels) G5) ]* D ks) [Gs)bes) AG) Ges) | 


where the + and - symbols indicate LHP and RHP factors. 





W(s) = 


This same result, obtained through standard time-domain tech- 
niques, requires a formidable use of tedious multiple integrals plus the 
complex reasoning behind the time-domain motivation of spectral fac- 


toring. 


It is interesting to note that factoring the input power density 
iF ~ 
spectrum, Oss) = i) (s)° ® (-s), determines the system which could 
VV VV VV 
produce the observed statistics when excited by "white noise" with a 


unity power density spectra, as was pointed out by Bode and Shenwonee 


In Figure 2.11, a white noise signal, with © &) = 1, passes 
through a linear system with a transfer function of D* Xs). ® &) = 


6” ( 5” 
vy'S) Y (-s) from equation 2.11. 


W ca we 
Pans) 
Figure 2.11 Reproduction of observed statistics from white 


noise , 


White noise is a useful abstraction, since itis a totally random 
signal having uniform energy content at all frequencies, or alternately, 
an impulse auto-correlation function. It will be one of the major purposes 
of this thesis to stress the visualization of a random process, single 
or multi-dimensional, in terms of the linear mathematical model which 
could create the process. This has the effect of partitioning the process 
into two parts: (1) The white noise excitation, which is totally random 
and thus unknowable, and (2) The hypothetical physical system, which is 
completely known and which has instantaneous internal signal levels which 
completely define the entire past history of the white noise excitation for 


future uSe. 


219 








2.6 Multi-dimensional optimum systems 


The class of system considered in this section is pictured in 


Figure 2,12. 





Figure 2.12 A multi-dimensional system 


Here a set of n input signals, v, each of which may contain a 
signal and noise component which can be correlated with any other signal 
or naise is to be processed by a linear multi-dimensional system W(s). 
The n auitputs, r, are to be compared with ideal or desired signals, i, 
and the set of differences constitute the error signals. As will be shown 
at the end of this chapter, the ideal signals result from a linear opera- 
tion on the signal components of the input signals, and specification of 
O iv, 8) for j andk=1, 2, . . . nis enough to uniquely specify 


this relationship, as was shown to be true in the one-dimensional case. 


The criterion for optimum performance is that the mean-square 


value of every error signal is to be minimized simultaneously. 


W(s) is best described in matrix notation: 


ris)| = Ws) VG) (2.17) 





where W, ,‘s) is the transmission linking the ‘4 output and the pe input, 
Consider the i error signal. 


WY 
E(s) = Us) - 2 W,,(8) VAs) 





J=| Jeo 
2 . 
“i oe wall 27j i oO. e.(s) 
~—}e0 1 1 
TZ i. (an 
Se. = y= om] ds d De, e.(s) 
I me 
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From equation 2.9, 
mM mM. 
We e(s) = @ i. i.(s) - > © i. v.(s) W..(s) a Dy. i.(s) W...(-s) 
twas ia oes: ae | iy JE lad ij 
+ ia W...(-s) i W...(s) Ov. v,(s) 
J=1 J R= | ‘ J 


; : th 
In matrix notation, let | W;(s) be thei rowof W(s). The scalar 


e. e.(s) is then seen to be expressed by 
Be. e.(s) = Di. i.(s) - ae Di. “) - Wa) Dv, i(e)| 


+ W,(-s) |G wt) | w,(s)] 
Let W.(s) be replaced by W,(s),+ €JW,(s) , where€is a scalar 


and the variation J W.(s) is an arbitrary row vector, each element of 
which satisfies the same physical realizability condition as in the one- 


dimensional case, 


d 0) e. e.(s) will be evaluated term by term to show that the 


matrix variation is found by an analog to matrix differentials. 


§ o i, ifs) = 0 
1 i 
| _ 2 Oo. ) 
Sf woo © i, -)]} ees {more fue Oi. wl} 
Ee = 
= Jf W.(s) 0) i. vs) 


sf cade Ov, 1) = = {wn + € o W(-s) ; 0) & ico} 
= id W,(-s) © 7 i(s) | Ss2 


Suico Pe] 


Ze {loycn, ¢ éuco} [Bey]: { mse] ogsmcol} 
€=0 
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_ s ll 
> 4 = =——— 
: 
* - - 
i 
—_ @ ® 


= ae? be dW. (s) + J WA-s) |B) |w,s) 


Jo 


a) 1 
e, = = f ds {-Lw Oi. v (s)] (J w,(-s) Qy. is) 


i 27} 
G8) exe w,(s)]+ Ewt-0)(Bie)}w(0 | > 


Each term under the integral sign is a scalar and can be trans- 


—J)eo 


posed at will, and the sign of s changed as was described in the single- 
dimensional case. Also, oO! (-s) = i) (s) since 0 v. v.(-s) = Ov. v.s), 

VV VV ee! iad 
Equation 2. 6, 


jee 
fe? = 0 = ms J ds Emons Di, vi-s)| - Ov, i(0)| 


(o" (-s) )| w,<s)| {gy \w.sr] $ 
- Ay fae ag 25 W,(-s) dws), {- Gv, i(s)| + +[ Bs, wis) $ 


— }o9o 
This scalar integral expression is identical with the sum of n 





one-dimensional cases and the same reasoning, element by element, can 

be applied to the column vector as was applied to the single dimensional 

case. That is, there can be no net LHP poles in any element of|® | w,(s) (s)| 
-O f i (s)| since they are separately multiplied by arbitrary functions 


dieing RHP poles only. 


Therefore, 


££ {[B, cel we] } - tt * {o v; wolf (2s 1,2 -.m) 


-1 
where thes £ operator is understood to apply to each element in the 


column vector. 


An expression involving the matrix [wis)|is thus found: 


> (6. (s)| [wisi 77° ]8 v tend 


The remainder of this work will need to express compactly the 


cross-power density spectra which exist between signals in two sets or 
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vectors in a random process. The following convention will be observed. 


®,,,(s) (s) will have an i us element Oe in Thus 


ie a {3% (S ) wie} erage {Bit ior | (2, 18) 


This is the implicit solution to the optimum multi-dimensional 
system under mean-square error criteria. In the special case of a 
single-dimensional system, the result is identical to that derived previously 


in Equation 2.15. 


Unfortunately, W(s) is not directly obtainable from this expression 
since Ow) contains poles in both the LHP and the RHP. This defining 
equation implicitly involving W(s) has been previously obtained inde- 
pendently by anaes and Hsieh and Leonidas: °, and the next section 
will outline and analyze their proposed methods of solution for this set 


of intercoupled equations. 


2.¢ Past attempts to determine optimum multi-dimensional system 


1 
Hsieh and Leondes ? employed time-domain concepts in deriva- 
tion of the optimum system. Their solution will be expressed in the matrix 
notation of the previous section. The basic problem is to determine W(s) 


from the equation 


sedi 104 whey | a et 142 io" (2. 18) 


Hsieh and Leondes added an undetermined matrix F(s) to the 





above equation so as to provide an equality of both LHP and RHP poles. 


OY . Wis) - Ow) + F(s) (2. 19) 








= 
F(s) contains the RHP poles of Pts) W (s) -o. (S) 





Thus the Z fiad operator is no longer applicable, it being understood that 
ie 
W (s) will have no poles in the RHP. 


—| 
w Gy Os) {9 ies F(s)} 





299 - 








W (s) = ee jaa B «| {o (§) + F(s) (2. 20) 
VV vi 

















EE 
where AT and A are the LHP and RHP factors of Ds) respectively 
and | adj @ s) | is the adjoint matrix of ® (s) . 
VV VV 
u 1 rom 
A” Ww (s) = — lag ci) s)| © (s) Gia lad; Bis | F(s) 
ie VV vi IG VV 


Let - [asi Bs) |P is) = HH" (s) + H (s) (2. 21) 
A a 








where H’ (s) is known and contains only LHP poles, obtained by perform- 





ing a partial fraction expansion of each element of —_ | Adj ® (5) ]@ 


H (s) contains only RHP poles. = 





Each element of [Adj Bs) Jean contain only as its LHP poles the 


LHP poles of ® &) ; — F(s) will contain only RHP poles. Thus, 


SS 
(TSateteatemtatinat erste, 


We 
1 1 1 _ 
— oD = —— 
o: | ad: G «s>| F(s) Ee st P, Cc + J (s) (2a22) 
where -P. is the (eee pole location of © iS , having an undetermined 


matrix coefficient Cc, and J (s) is a matrix with only RHF poles which is 





not considered further. Accordingly, 








0 -s + a i i 
W (s) = Ar { as “a s+P, © (2.23) 


At this point, it is claimed by Hsieh and Leondes that the un- 


determined matrix coefficients can be obtained by substituting w'(s) 





into the basic equation, 2.18. 
-] we ; 
if + 1 i | =. 
“¢ {3 GS). ar fH) + Z 5s +P. | = & g s)} 


No proof is offered as to the sufficiency of the resulting equations. 


The non-generality of this method will now be demonstrated by 


considering a particular example, a multi-dimensional predictor, and 
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showing that the resulting equations are insufficient to determine the 


1 v2 
C coefficient matrices. 


A multi-dimensional predictor example 


The input signals, Va» have no noise superimposed, and are re- 
presented by the known matrix Os). The = ideal signal is a prediction 


of the fae input signal Tseconds in the future. 


A ie eee ile 
1 l J J 

1 1S ae 
® ae = e © ai ey 


Dts) = o Gs) 








From equation 2. 21, 
+ 2 ol 1 | 
H'(s) =~ {i Adj O ss) 5, te) 
-1 ~| sT 
“ff fa* Gls) . ale 
“fe {at ef} 
From equation 2, 23, 


T 1 -1 + st a 1 i 
wis) = i {f4 (A e I)+2? wes } 


Az | 














where C’ is determined from the equality of om) 2.24, 
-/ = fest 
Lx {B6) x { te'(a* TL) +. ts. Sys tee Epp 


Next it will be shown that a partial fraction expansion of 
~! 
®, fs) Az are %7) in the poles of 6 ‘s) is equal identically to the ex- 
pansion of 43° 'fe sT Oy5) . This will be done by proving 


that 
He (ahisy ee a ee 
S=-P i 


ATG) 
which ensures that the external factors outside both the Q &. (S) matrices 








ae 


are identical for each pole. 


It is assumed for simplicity, and since this example is designed 
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to be essentially a counter-example, that only simple poles exist in A‘(s). 


a Pls) ad P(s) 





Ye So = —————— 
( QS ) 7 io p:) 
A= | ee es e ; 
At” (Awe *™) Se Ps) ¢° 2 > pr-pjye * 
ao = ee 
a) (St P..) Az] (S+ G;,) IT (Pe Pi) 
pie A+ ST) _ : 
af (ato) = us (s4 Pre) (Ss pip.) e'~7 
a" = | 
S) 5=-P) P (s) Az | (SP. ) 1 (Pr-P,) se-p 
P (-p. -E: +r Rei J 
— ty —!, dt (S+Fre ) 
A=! =. ie 
[. (Px-Bi) 6) ‘sil 


Each term of this summation equals zero when x # J 


AR (2%) oy) | = Mae ) : Ly (Pe- P, ) S er 
are) Is=-8 Hh, (feP) PER) O 
Therefore, equations involving the LHP poles of 7) (Ss), which 





1 
can determine C , are 


Wr om 
mT 1 1 i ’ 
Lt {By ARIES lz See: ]} es i. 








1 1 i - 
oe a) @ (-P) Cc “pal 
1 i —_ 
a (-P.) @ (-P.) Cc = A Gi = 1, 2, ... ™ 


where 0 is the null matrix. 


. 1 
The matrix 7> (P) ORK P.) will have non-zero values only 


in elements where the scalar 6 v. ae has a pole at s = -P.. Since 


simple poles were assumed in © ay , and since a determinant is 





formed with each separate term containing only one element from each 
column and row, it is clear that the a LHP pole will in general lie along 
one column or row of DMs) (actually a column, as will become clear 


in Chapter 3). Thus, an equation of the form 
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is patently not enough to determine ice ; 


Through the medium of an example involving a multi-dimensional 
predictor, it has been demonstrated by counter-example that the proce- 
dure of Hsieh and Leondes is not generally applicable. A method of un- 
determined coefficients is only valid when it can be proven that the co- 


efficients can in fact be determined. 


bg approached the same problem, but attempted to find a 
closed-form solution and was successful for a quite restricted class of 
multi-dimensional random processes. Unfortunately, in his derivation 
of the optimum system he chose to minimize instead of the mean square 
error of each output the mean Square value of the total sum of all the 
errors, which could allow undesirable cancellation effects between the 
individual errors and in general is not the best quadratic error criterion. 
It is interesting to note that his implicit solution is identical with that 


obtained by considering each error separately, as in section 2.6. 


Amara considered the class of random processes characterized 
by a matrix of power density spectra, ® s), which can be transformed 
to a diagonal form by pre- and post-multiplication by matrices with nu- 
merical elements, such that 


os —. u = [D,,(s) £1 


Ce raed 





where a is the Kronecker delta, (J 5, = 0, i? j; J = 1, i= j) 
1) 


he : vt [ p. ts) i] [uT] -1 





a 


| + + 
eh = A {-s) D; ,(s) 


Thus, the optimum system is given by 
wz? {vt [iced] Pio get wet 
-1 
-A# 1 is) } 
= e , + . “| eee 4471 
yee [ D:(-s) Li [D; (5) S| [U | Ww cs) Jd {v @.. oh 


If a W (s) is considered as another optimum system, the 





above equation in similar to n one-dimensional optimum systems, and 

acl I 1F Tale aera z al 1 s 
At { Ds (s) | [ut] -* w (s) “Lt Gua J. u.@ ® 
a ee -1 1 | 

= ~ tS 
_ o Da L,|2 x { pra fej) 2 2 ait | 


The requirement that the power density spectra matrix be diago- 





nalized by a numerical matrix is a severe limitation on random processes 


in general, as will become more clear in Chapter 3. 


In summary, there is no hitherto published satisfactory solution 
for the optimum n-dimensional system. The next section will consider 
amore general approach to this problem, which will yield physical insight 
into random processes and bypass the restrictions of the previously des- 


cribed methods. 


2.8 Anew closed-form solution for an optimum multi-dimensional system 


In the solution of the single-dimensional optimum system, where 


from Equation 2.14, 


az {84s ) wis) | AF “49 vite) f 


(0) oye 5) was factored into RHP and LHP terms 


Bs) =O t-s) G7 ls) (2. 25) 


and both sides of the equation were multiplied by Ts) ,» maintaining 
VV 
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-] 
thet equality. 
If the matrix PD (s) could be factored into two matrices, 


oO yi 3) = GO __(s) @ * \s) 


Ape Vc mae 








where ) (8) and its inverse contains only RHP gee it is logical 
to inquire er multiplying both sides of the 1#? matrix equality 
O ~ fs) roel preserve this identity. 


More generally, if the matrix equation is given 


J4* {ay} -44 7 {ais)} 
does ize = { cs) As) } =0% 7 f cis) Bs) } 


where C(s) has only RHP poles in every element? The ‘alee elements of 


C(s) A(s) and C(s) B(s) are, respeya ey: 
me . 
> C. Ak, and C.,, BK, 
k=] y J R=) 


From the previous arguments of this chapter, 


127) { c.,, ak, [= 12 re { Cy, Bk; f 
Ag” { ax, } Ag | Bk, } 


Obviously, the addition of n equalities of LHP poles is still a valid 


since 


equality. 


-1 
Thus it has been demonstrated that multiplying a matrix L# 
-1 
equality by a matrix with all poles in the RHP preserves the BILE « equality. 


446 8)) Os) Os) W wis) ae {[Bj0 moka B io} 
J2 tome Ww ws) = p(s (s) W" - ie) =1,° {eo (s)|"° om G, (9 
de «(Bisel ze “(iBT Boy 


In the above steps, © yl) a must contain only RHP poles, to 





————. | 15 
justify the operation under the.f# ~ operator, and a} wy (8) must contain 
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-1 
only LHP poles, to justify the removal of Ze“ ” 


Further restrictions must obviously be placed on on yy) and 


om (s). It has been shown in Section 2, 6 that OG" yi 8) uy ae -s) 





Eenctorss let o= (s) = G(-s) . Gq" (s) (2.27) 
where G(s) and Gs) are both physically realizeable. Thus 


aii | 
wis) = [aia |? ge “[ot-s)] Ey(s) | (2. 28) 
In section 2.5 it was pointed out that factoring the single dimen- 
; . + t ' wf 
sional ® As) into @ f-s) : @ As) determined ® "As, the trans 


fer function of a linear system which could reproduce the observed signals 





when excited by white noise with unit power density. This is the Bode- 
Shannon approserm™ It is natural to inquire if a similar interpretation 


can be placed on the factoring of O Ms). 





Suppose a Set of n uncorrelated unity white noise excitations, W,» 


are applied to a physical matrix filter, G(s), as shown in Figure 2.13. 


Wy 





Figure 2.13, A random process created by n white noise sources 


We 
V,{s) = ra G, ,(s) Wits) 
We We 
Q v, ts) = z G, ,(-s) 3 Gre O w. W, 
but O ww, = | l= k 
= 0 1 7 k 


Dv, vis) = = G, (-s) eA”) 


In matrix notation, 


® is) = G(-s) G'(s) (2.27) 
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which is the desired result. That is, the process of matrix factoring, 
which leads to a closed-form solution to the optimum multi-dimensional 
system, is identical to the problem of finding a physical system which 


can produce the observed statistics with white noise excitation. 


Thus, the multi-dimensional problem has been shown to parallel 
exactly the single-dimensional case in notation and meaning, if the ma- 


trix expression is substituted for the one-dimensional transfer function, 


Chapter II will present various approaches and a complete solu- 
tion to the formidable matrix factorization problem. It should be pointed 
out again that this matrix approach produces the first general closed-form 


solution to the optimum multi-dimensional system in the Wiener sense. 


2.9 Statistical transformations on random vectors 


A great similarity has been demonstrated between the scalar 
and the matrix representation of random processes. For example, @ _(s) 
describes a single random process just as O (s) describes a set or 
"Vector" of n random processes. Some of the simpler relations to be 
derived were earlier presented by cumieras but in view of the sim- 


plicity of derivation using equation 2.9, they will be repeated here. 
Consider first the simple configuration of Figure 2.14 


Figure 2.14 x Y 
G(s) /}- —> 


Multi-dimensional System 


vs) G(s) x(s) 
= ae 
¥(s) ea G(s) X(s) 


Me a 


Me 
2s. ys) - yA G,,{-s) = wir a D x, x,(s) 


} —! 


R 
DMs) = Gl-s) Oils) G (s) (2. 29) 
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In the special case where x| is a set of uncorrelated white noise 


signals with unit power density, 


O (8) = 1 


T 
verifying Equation 2.27. 


IW 
D x. y ts) = 2 So $ x. x (s) 
T 
@ i) = Bois) aos) 


Next, the summing operation of Figure 2.15 is examined 

















x 
Gis) 


f H (s) 


Figure 2.15 A Multi-Dimensional Summing Operation 


Z (Ss) = s G(s) Xs) + SH, p(s) YRS) 
Jey =| 
2, 26) « S 6. es Gin ) D, Xe ©) 
J> | Rey 
+2 Saye Hin &) By pl Z Aaxl DE 6 G6) G,, & 


J= 
rz ar ARS) Z Tig) Qyy, €) 


B® (s) = G(-s) Fe As) + G-s) G(s) H's) 
H' (s) (2.31) 


Mes) Os) GHEE © “G) 
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The preceeding configurations were examined in deliberate 
It inferentially appears 


























Similarity to the scalar results of Section 2. 4, 
that a general formula for vector random processes can be expressed 


just as Equation 2.9 applies to scalar processes 


Jao. 





2 S G.(-s) Bx. y,(s) HT (s) (2. 32) 


Az} Ja] 


” Aw. 
where the vector X(s)| = 2 G(s) x,(s)| and ¥(s)]= zZ ae ¥,(5)] : 
i 


The ij subscript of x. Y, ves moe to the i vector input, Xo making 








up x , and similarly at the j te vector excitation of y 


To prove this formula, which is believed to be the most general 
expression of statistical transformations in linear systems, consider the 


system of Figure 2.16, 





MATRIX ELEMENT 
G(s) G'pa(s) 
J 
Heh H ty®) 
X.(s) x" (s) 
i q 
J 
ce, Bi if) 
X(s) X (s) 
p 
Y(s) ¥(s) 
Figure 2.16. A general multi-dimensional system 
mM oe 
x = e Se 
(8) va (2 ool) (5) 
( > wl (sy vi (s) 
Ys) = $ H’, (s) Y s)| 
t aa tu u 


From the basic equation, 2.9. 


ou = 5 (3 at -s)) S (2 wo) Pe y’ (s) 


Az! =| J=! 
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on ne th 7 . oie leah 
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A= | A= memes 


® xyt8) > 2 


' ‘ee : EF 
G(-s) Gx! y\s) [H(s) | (232) 
a a es a as: 





With the use of this formula, statistical relationships in multi- 
dimensional system variables are swiftly expressed. An example will 
prove the previous statement that () (s) is a sufficient description of 


the ideal signal in multi-dimensional optimization. Consider Figure 2.17, 





Figure 2,17. Calculation of O _(s) 


where all variables are random vectors and all systems are matrix 
operators. 


v(s)| = s(s)| + Nis) | 
K(s) | = |G (s)| S(s) | 


T T 
O (s) = Dis) Gis) + O ls) GAs) (2, 33) 











Thus, 6 it) is equivalent to G(s) if the input statistics are 


known, 


37. 








CHAPTER III. 


MATRIX FACTORIZATION 


3.1 Statement of the problem 


This chapter is concerned with factoring a matrix of cross-power 


spectra between signals in a multi-dimensional random process. Chapter 


II has shown that solution of this problem will yield two significant results: 


(1) A closed-form solution can be found for an optimum multi- 


dimensional configuration in the Wiener sense, 


(2) A multi-dimensional linear model is determined which can 


reproduce the observed statistics when excited by a number of uncorre- 


lated white noise sources, 


The basic equation is 


=: 
0) (s) = G(-s) G (s) 
= Na 








or, in expanded form, 


© viv,(s) ® Vv v,(s) ee ® v,v,(s) 


1 
Ovev ls) 
Oy v(s) ie: © VV As) 
G,,(-s) G,,{-s) Ce G,Af-s) G,,(s) Gos) 
G,,(-s) , ee oe G,.(s) 
G_4{-s) ; G_ni7®) G,,{s) 


ane 


(2.27) 


The G(s) which is found as a result of the factorization process 
is the matrix filter described in (2) above. Each element of G(s) and 
G(s)" must be physically realizable in order to meet the requirements 
given in section 2.8 for the solution of an optimum multi-dimensional 


configuration. 


a a Realizability considerations 


Before plunging into a solution of this thorny problem, it is nec- 
essary and useful to examine the properties of h (s) which characterize 


a set of random processes which could actually be found in the real world. 


The “ element of Ds) is Ov, iv,{s). where v. and We are mem- 





3 of an n-dimensional random process. Since Ov. V {- s) By, V. As), 
mG 
yi 78) = GAs). 


1 
In addition, Kraus and Potzl ? have proven that a necessary 





and sufficient condition for w) wy) to represent a valid multi-dimensional 
random process is that D (je) be positive definite for allw. This arises 
quite naturally if the n signals are allowed to pass through a system G_ 
which multiplies each signal by an arbitrary constant and sums the total. 


The power density spectrum inw of the single output is, from Eq. 2.29, 
, G [Fier] 6| 


This spectrum must have a non-negative value for all values 
of w, since a negative mean square value of power density cannot exist. 
Thus O (jw) must be non-negative definite for all values of w. The 
special case where I@,, ie) equals zero for all values of w will be con- 
sidered separatedly in section 3. 9, and a positive-definite limitation on 
® yy ie) will henceforth be considered a valid demonstration of the rea- 
lizability of the random process. As will become more clear in the re- 
mainder of this section, the only other case where a zero value of power 


density can occur at a finite value of w is the occurrance of a multiple 





even-order zero on the jw axis in Bs) 


ee 





Positive-definiteness is a property of a matrix which is capable 
of a number of separate verifications. For the purpose of this theory, a 
particular method indicated by Bela is preferable. He states that 
a necessary and sufficient test for positive-definiteness of a Hermitian 
matrix is that each of the diagonal elements be positive and that the de- 
terminant be also positive. In the power density spectra application, 
this criterion means that the power density spectrum of each of the n 


random signals must be positive, as well as |® me) , lor eliza: 


It is interesting to relate these requirements to known properties 
of the auto and cross-correlation functions. For simplicity, the 2X2 case 


will be examined, 





12 


® (-s) ® Xs) 


The requirements for realizability are that ri) 1: D iv), and 
® bio) O ,. (50) = 6 193) ® -jw) each be greater than zero for allw. 


eo! {5 0} ee) toga) 


On ee 


: ] 
Newton, Gould,and Kaiser : have presented some physical rea- 
lizability requirements on the correlation functions, derived from ini- 
tially setting the square of a linear function of the signals equal to or 


greater than zero: 
Yii (0) > Br) UWe12) -eatec (3.1) 


¥i1(0) P.2(0) 2 | Pin (r)) ae) Sos (3.2) 


A relationship between the power density spectra and the corre- 


lation realizability requirements will now be derived. Eq. 3.1 can be 


Be (ee 








expressed fori=1, as 


joo ed) 
1 1 st > 
27} ic d 405) " Ory fas i @ hs) = 0 





Replacing s by jw, 


Oo 


1 TEs a : 


= CO 


Since the real part of 1 - el” a 


is always equal to or greater than 
zero, this integral will always be greater than zero if D, (iw) 7 “W ion 
allw. This relates the positiveness of O, (jo) (or ®, (ie) ) to the fact 
that a signal has the highest correlation with itself as opposed to any 


time-shifted version of itself, 


At this point, it is well to ask if the oe oa of Qs) can be 


determined by inspection. It is not enough that Q,, = G, ,(-s). For 
2 

example, OQ, ,(s) = (aria satisfies wail a vielen but 

-w +3 


3 is negative for w >[z , 
o +4 


The example above contains a on pair of zeroes on the jw 
axis and is not factorable into OK ». OF - -s). This pair of simple 
zeroes are the only factors for a. 10 = O - -s) and which cannot 
be factored with mirror symmetry about the jw axis. Thus, factoriza- 


tion is the only realizability requirement for a single power density spectra. 


The second correlation function inequality, Eq. 3.2, may be 


written as ee 
m) | 78 2uG) > a f4e Be) 
—)eo —jeoo 
+ fas, Beye”. 2 4 CGE) 2 
211) (S, ) =a Sees 
—j)oo Jd 


or, replacing s by jw, 


38. 








dy | [tm du, [s, vw) O, vr) - € a “ Uw) ) Bey) | z0 


_ od —oO 


The maximum value of ely 2) D 12'I%4? P o(-ie, Postar tees 
when , = W, = w and |B 1 (ie) | is amaximum, Therefore, the minimum 
value of the integrand is 


D (5a) B,o(50) - Bo(-io) B, Ww) 


for some value of w. If this integrand is positive for all w, which is the 
previously given realizability criterion, the integral will always be pos- 
itive. Positivity of this integrand is again equivalent to factorizability 


of the 2X2 ® yt 8) | , as was true for the auto-power density spectra. 


In summary, the realizability criteria found in the literature for 
the existance of the correlation and spectral functions of random processes 
are related and the factorability of the individual power density spectra 
and the matrix determinant is enough to satisfy all requirements. The 
reason for the emphasis on this matter of realizability is that any method 
of finding a real system which can create the observed statistics must 
fail when either the diagonal elements or the determinant of Ps) 
cannot be factored, for otherwise the paradox of unrealizable signals 


being created by a realizable system would exist. 


3.3 Two special cases 


In this section, two special matrix configurations will be examined 
which can be readily factored. These particular cases are of importance 


Since they provide goals for a more general factorization procedure. 


When Ds) is a diagonal matrix, each element must be able to 





be factored into LHP and RHP terms, as shown in section 3.2. There- 


fore, 








262 


where D(s) is a diagonal matrix containing the LHP factors of all the 


diagonal elements of ® (s). 





The second example of an easily factored matrix is the numerical 
Hermitian matrix. eon has investigated this problem and has proved 
that a solution always exists providing that the matrix is positive definite. 


The problem is to factor H into X , <a where X is a numerical matrix. 


Lee shows that a canonical triangular form exists for this problem, 


a1. ee, where T is triangular and an entire family of solutions is 


= 


generated by T. U. aii ; ae where U. U' = L or U is a unitary matrix 

with real elements, In illustration of this result, suppose H= i : 
Th 0 

and eu 
me "29 

The elements of T can be solved for consecutively because of the 
; ee +) 
triangular form, yielding (ate 0 413 Ts 


_ ey 5 1 1 
2 fay ayets 


© 
— 
aN) 


A general form for a 2X2 unitary matrix is 


Q. ey: 


U = -lcadz | (3. 3) 

This single degree of freedom reflects the difference between 
the number of unknowns, 4, and the number of independent equations 
which can be written, 3 (as the symmetrical form of T leads to identical 


equations for transpose pairs off the main diagonal). In the general case, 

n(n-1) 
2 

tion problem. 


bounded variables can be adjusted independently in the factoriza- 


The particular significance of the numerical case is that the 
general factorization procedure to be presented in section 3.5 will re- 


duce in the last stage to a matrix with only numbers. Another perhaps 
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more conceptual use of this special result is to visualize a matrix ® (jw) 
as a Hermitian matrix which can be factored for every value of w, pro- 
viding that the matrix remains positive definite (the realizability require- 


ment), and thus a matrix which is some function of w does exist. 


It might seem at first approach that a triangular form could be 
postulated for @ _(s) factorization, in analogy to the numerical case. 


This is unfortunately not true, as will be demonstrated below. 


Referring to the general two-dimensional case, 


®(s) G(s) G,,(-s) 0 G,,(s) G(s) 


© ,f-s) O,A(s) Gol-s)  G(-s)]| 0 G(s) 


—_ 


+ + 
Giy(-s) G(s) = Gls) = OF is) BL te 


suppose that 


+ 
G, ,{s) = D ls) 


G, fs) G,,(-s) = O ot-s) 
Gy ,(s) = OAs) 
G, ,(-s) 

If Gs) has its zeroes in the LHP, Go ,(s) will have these as 


poles in the RHP. If G, 1's) had been selected to have RHP zeroes and 


LHP poles, the inverse matrix G(s)? would be physically unrealizable. 





1 
0 
G..(s) 
re) : 11 
: Gy ,{s) 1 
G,,{s) G,o(s) G,o(s) 


Accordingly, the triangular form does not yield both a solution 


with a realizable and inverse realizable G(s). However, it offers a use- 


aay = 





ful method of reproducing a multi-dimensional random process in an analog 
computer where inverse realizability is of no concern. To assure a rea- 
lizable G(s), the elements of which may be solved for successively, itis 
only necessary to select the diagonal elements of G(s) with RHP zeroes 


and LHF poles. 


3.4 Properties of matrix transformations 


The next section will present a general method for solving the 


matrix problem 


Outs) =sGes) Sea) (2. 27) 
VV 











The philosophy of approach will be to multiply @ is) by a suc- 
cession of simple matrices, transforming it at every step, until the nu- 
merical form is reached. In this section, the properties of simple ma- 
trix transformations will be presented, emphasizing the viewpoint that 
a matrix multiplication can be used as a tool to mold a given matrix 


into a desired form. 
There are three basic matrix manipulations to be considered: 


(1) Multiplying a row by a function of s and adding it to another 
row. 

(2) Multiplying a row by a function of s. 

(3) Exchanging rows. 


In the above list and in the discussion to follow, operations on 
rows by premultiplication are investigated. The results are equally 


applicable to column operations through post-multiplication, however. 


First, any row operation on a matrix can be accomplished by 
premultiplying the matrix by an identity matrix on which the desired 
row operations have been performed. The properties of interest in these 
transformations include the value of the determinant of the transforming 
identity matrix, and the realizability and inverse realizability of this 


matrix. In this particular application, as will be shown in the next sgec- 


By. ee 





tion, row operations are performed with matrices whose elements must 
have only RHP poles and whoSe inverse must also only have RHP pole 


elements. 


(1) Multiplying a row by a function of s and adding it to another row. 





i 0 0 0 

T(-s) = 0 1 0 0 
0 0 1 0 

A(-s) B(-s) C(=-s) 1 


The above matrix multiplies the first row by A(-s), the second 
row by B(-s), and the third row by C(-s), and adds the total to the last 
row. | T(-s) = 1. 








i 1 0 0 0 
T(-s) =| 0 1 0 0 
0 0 1 0 

-A(-s) -B(-s) -C(-s) H 


The simple form of the inverse will result for all matrices which 
add to or from only one row. If A(-s), B(-s), and C(-s) have RHP poles 
or no poles the matrices are proper for this application, regardless of 


the location of the element zeroes. 


(2) Multiplying a row by a function of s. 


1 0 0 0 
a8) = hg 1 0 0 
0 0 1 0 
0 0 D(-s) 
The above matrix multiplies the last row by D(-s). | T(s) | = D-s). 


age 





a 1 0 0 0 
To —s)5 P= 0 iL 0 0 
0 0 1 0 
0 0 0 a 
D(=-s) 


I(-s) must have both RHP poles and zeroes to be a proper trans- 


formation. 


(3) Exchanging rows. 


1 0 0 0 
| Fe aalifo 1 0 0 
0 0 0 1 
0 0 1 ah 
The above matrix exchanges the third and fourth row. | T =" =e 
a4 aa 
T =o k 


The matrices described above perform simple transformations, 
possess simple inverses, and in the second case can modify the deter- 


minant of the transformed matrix by other than a constant. 


3.5 Matrix factorization: A general solution 


A procedure is to be described in this section which will always 
yield a solution to the matrix factorization problem regardless of order, 
providing realizability criteria are satisfied. Because of the complexity 
of the problem, no easy solution appears to exist. However, the method 
of factorization to be presented here has been broken down into several 
separate phases with each phase consisting of simple matrix transforma- 
tions and each having a well-defined goal. 

Each transformation step can be presented in the following fashion: 


1-1 i 
Tie . SOME) T. (s) - © ‘s) (3. 4) 








an 


a § —<==—@ © 4 Ge « « 





The relationship between the pre and post-multiplication matrice 


; T | 
is specified in order to ensure that @'(-s) - Ib %s) | for all i, 


The overall objective of this procedure is to produce a matrix 
r 
with numerical elements, oO , after a number of successive transfor- 


; yan ° 
mations. Thus, if Qs) = © (s) 








Tp(-s). 2. . T,(-s) T,(-s) OAs) T, 


omen me ee Oe 








T . 
can be factored into two numerical matrices, N. N’, Inverting, 


Ds) = D%(s) 

















G(s) = T,(-s) . T, (s) Tp(s) . N (3. 5) 
G(s) = N. THs)... TAs). Ts) (3.6) 











A proper solution of the problem will yield a ‘seme mane -realizable 
G(s) and im This will obviously occur me T . As) and T, ea) have LHP poles 


only for alli. In other words, as im Lg) is pe een into various con- 





figurations the realizability requirements on the solution will be met if 
each transforming matrix meets these requirements, Drawing on the 
results of section 3.4, the following constraints exist on the elementary 


matrix transformation T.(-s) : 





Ci) lf T,(-s) multiplies one row of 6 ts} with a function of s and 








adds it to another, this function must have no poles in the LHP. 


(2) If T.(-s) multiplies a row of ® 1 with a function of s, this 


function must have no poles, or zeroes in the LHP. 





since the equation 


@ (s) = T,(-s) 6 is) T, Tags 


is in the form of equation 2, 29, T(s) can be interpreted as a physical 














M5 = 


~1 -1 
T (4s). TCs)... Tyg) at xf ts) . T, 7, (5) — 








system with an input random process having a matrix of cross power 


a 
density spectra © : (s), and with an output spectra of 0) “(s) . The suc- 





cession of matrix transformations then is equivalent to cascading a 

series of physical systems until an output spectra involving only white 
noise -- the numerical matrix N. no - is achieved. This white noise 
random process is operated on by N- to produce a unit-valued uncorre- 
lated set, whose spectra is given by the identity matrix. The total cascaded 
system thus operates on the given random process and produces uncorre- 


lated white noise, and is naturally envisioned as the inverse of the hypo- 


thetical physical system creating the random process. 


There are three general phases to this matrix factorization solu- 


tion: 


(1) Pole removal. The pole removal phase starts with the given 
matrix and removes the poles of every element. 

(2) Determinant reduction, The determinant reduction phase 
converts a matrix with polynomial elements and with a determinant which 
is also a polynomial in s, into another matrix which still has polynomial 
elements but which has a unity determinant. 

(3) Element order reduction. This phase operates on a matrix 
of polynomial elements having a unit determinant until a numerical ma- 
trix is reached. 

To illustrate the central ideas of this method, a 2X2 example of a simple 
yet non-trivial case of matrix factorization will be solved. Then, the 


general case will be examined and each step justified. 


EXAMPLE 


Suppose a simple two-dimensional system is given by 








1 
G(s) ~ at Z 

1 

s+ 4 
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and is excited by two uncorrelated unit-valued white noise sources. The 


matrix of output power density spectra is 








~2s"+ 13 -2s~ + II 


- o- T . pba mS 3 eee 
® is) “ee: 2 (- $+2) (-St3)(S+2) (S43) (—S+2) (-S#3) (St!) (s+4) 











seo || —25" 417 
(—St1) (~S+4) (+2) (S#3) (S41) (544) (S41) (544) 


The inverse of the matrix G(s) is 





ee 


1 1 
Gis) 1 - (stl) (st2) (s+3) (s+4) st+4 +2 
—— (= 4s + 10) 
Dati ee a 
stl st+3 


which is unstable or unrealizable. Thus, the question is posed: Cana 
matrix G(s) be found which is realizable and inverse realizable, andis a 


solution to 


G-s) G(s) = O is) = O%s) 2 











(1) Pole removal phase 


The objective of this phase is to remove all the poles in every 


element. This is quite easily done by row and column multiplication. 

















P(-s) (-s+2) (-s+3) 0 

0 (-st+1) (-st+4) 
B (s) = T,(-s) © (s) T, (s) = - pee +13 - ee + 11 
Meee 11 rae, 


(2) Determinant reduction phase 


In this phase we first desire to manipulate the matrix so it has 


a determinant which is constant and independent of s. This will be the 


Ae 












































T 1 
case if [T,(-s) | : | T, (s) B's) 
Hence, 
IB) = - 16 oe 100 = (- 4s +10) (4s + 10) 
T (-s) = 1 0 
2 
0 a 
-4s+10 
2 
2 -2s +11 
2 1 as 7 -2s +13 a ee 
d ts) = T,(-s) ® (s) a (s) = 4s + 10 
og _b gS 17 
- 45 + 10 (-4s+10) (4s+10) 
; 2 
It is now desired to remove the poles in @ (s) without affecting 
its determinant. An adding transformation is thus called for. 
2 
2 aes 7 ~ oto 
O 5,5) - -45 +10 me 8 -s+2,.5 
If the first row of oO 2(5) is multiplied by ses and added to the 
second row, the pole will be cancelled if 
ce. (Si2teo sme = +375 
s = 2.5 
or k = .75 
The total added quantity will be 
partic) 2 a . oad 
[ars (-2s +13) = 1.50s+ 3.75 + amo 6 
1 0 
T(-s) = 
Bei) 
-st+2.5 ! 
Ete) = T(-s) Os) 7. Te) : 5 rake -2s+5 
® 3 3 
2s+5 2 
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7 if T T 
2 | 744-8) 730s) © (s) | : r, (3) : ie (s) 
1 


i 
Siauo. (-4s+10) (4s+10) . [ae = 1 




















(3) Element order reduction phase 


Consider the array of the highest-order powers of s in each 
element of 6's) : 





ae -~2S5S 


2s 2 


Note that the first row is equal to the second row multiplied by -s. 


This is no accident, and arises because the determinant is independent 


of s. Performing a reduction transformation, 

















T ,(-s) = 1 S 

0 1 
Os) = T,(-s). Gs). Ty's) = 13 5 
5 2 


(4) Solution 


This numerical matrix is, oddly enough, the example considered 


in section 3.3, for which the canonical triangular factorization is 


5 
gs 0 13 = 
a VTS 
ies 5 1 


From equations 3.5 and 3.6, 


=f =i 
G(s ‘ . T, . T, (s) . T (s) . N 








@its) - nN" . Ts). Ty(s). TAs). Tyls) 


~AG 











5s +13 S 





1 (s+3) (s+2) (s+3) (s+2) 
G(s) = Sve 
5s+11 s + 10 
(st+1) (st+4) (stl) (st+4) 
4 (s+10) (s+2) (st+3) - s (st1) (s+4) 
1 
ete? 52 (st2. 5) 


~ (58411) (s+2) (st+3)  (558+13) (st+1) (s+4) 


This example has illustrated the significant features of the general 


factorization procedure: 


(1) Poles removed by row and column multiplications, 

(2) Factors removed from a determinant of a polynomial matrix 
through successive introduction and removal of the inverse factor as a 
pole. 

(3) Reduction of a unit-determinant polynomial matrix by operat- 


ing on the highest powers of s in each element, 


The general nxn case will now be examined. 


(1) Pole removal phase 


In the previous example, all RHP poles were identical in a single 
row, and al) LHP poles were identical in a single column. This configura- 
tion facilitated the efficient removal of these poles by row or column mul- 
tiplications, but did not occur coincidentally. In the general case the ike 

th 
element of s)is G.(-s) G.(s)} , wher s), is the te row of G(s 
D (8) 18 G(-s), Gs) eG,(s), G(s). 
All elements of thei row of yt) will have the same RHP poles, 
which are the poles of G.(-s) , ms the LHP poles in the j i column of 
Laie ally 


2 (s) will also be similar, except for occasional cancellation effects 





in both cases. 


~50- 








(2) Determinant reduction phase 


The resulting O(s) matrix, which has elements which are poly- 
nomials is s, must — factorable determinant with the RHP factors 
a mirror image of the LHP factors about the jw axis. If not, the random 
process is not realizable according to the discussion of section 3.2. Con- 
sidering the RHP factors, there is in general a constant and a number 


of not necessarily distinct zeroes in this determinant, 


Suppose that one determinant factor, -s+a, is selected. The 


transformation matrix 








1 0 
T(-s) = 1 
1 
-sta 


divides each element of the last row by -s +a. Let each of the resulting 
last row terms be expanded by partial fractions. The residue of the pole 
term in the aie element is Q,,, a). The important question now under 


consideration is: Can each of the first n - 1 rows of D(s) be multiplied 


by a term i and added to the last row so as to eliminate simultane- 
ously all “ake the poles in the last row? 
th th i’ g ij? 
The added pole from thei rowinthe j columnis —————— . 


-Sta 


Accordingly, the equation to be solved for n - 1 values of k, is 


m-| 
> k. ® ; (a) = - 9 fa) Gead, 2... 1) 
Az] 
This in effect requires that the last row be a linear function of 


the first n - 1 rows of D(a). Since 1® (a)| = 0, because - s + a isva 


factor, the last row of O(a) is always a function of at most the first 





n - 1 rows and the above equations can always be solved. 


The n - 1 element vector | is found from 


me 





© n-1 (9) «| : - © - a) 


where D a) is the square matrix of the first n - 1 rows and columns 


th 
of O(a), and G (a)| contains the first n - 1 elements of then rowof 
G (a). 


The pair of premultiplication transformation matrices is thus 


= © ee! | O 
O = O 
Ri Ra ara: ae | A ' R ae | Rn=| ne 
—Ste ~StaA —Sta = ~Sta_| STOemeC EA “SIA, “SIA 
a | - 
O 


From a computational point of view, K | should be determined 
and first used to transform the polynomial matrix with the right hand nu- 
merical matrix in the last expression above. Then, the - s + a factor 


should be removed from each element in the last row by synthetic division. 


The same transformation, only with the transposed LHP matrices 
Ds) Thus, the 


has been decreased by two. This procedure can obvious- 





in post-multiplication, will remove the s + a term from 


® (s) 


ly be iterated for all factors, single or repeated, until the determinant 





order of 











is only a positive constant K. Then, multiplying the last row and column 
by Ke will produce a matrix with polynomial elements in s and a unit 


determinant, 


The only case in which the procedure will not be applicable is when 


the last row and column of O (s) is zero except for the diagonal element. 





But in this configuration, the diagonal element can always be factored 
and the problem immediately degenerates to ann - 1 factorization prob- 


lem. 
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In summary, it has been demonstrated that a polynomial matrix 
can always be reduced by simple transformations to a form which has 
a unit determinant, independent of s. This is an original contribution 
to the general theory of matrices with algebraic elements, and is the 


key to the solution of the matrix factorization problem, 


(3) Element order reduction phase 


The starting point of this phase is a matrix having a unit determi- 
nant and the goal is to produce by successive transformations a numerical 
matrix. An algebraic matrix having a constant determinant is called in 


the monumental work of Gubiee an "impotent" matrix. 


Cullis proves that any unit-determinant impotent matrix can be 
obtained by successive multiplying~and-adding transformations on an 
identity matrix. Since the inverse of these transformations always exist, 
this means that there exists at least one set of transformation matrices 
which can operate on the given impotent matrix to achieve the identity 
matrix. Unfortunately, no method has been previously presented for 
determining this sequence but the procedure to be given next appears to 


be completely general and achieves the desired reduction. 


Suppose an array is formed of the highest powered terms in s of 
each element. Obviously, the terms in the determinantal expansion which 
have the highest power of s will all be formed from these terms and must 
Sum to zero because the determinant is independent of s. In this array 
identify the terms which make up the highest power of s in the determi- 
nant. Replace the other terms in the array by zero. For example, suppose 


the highest terms are 


1a ae a: 2 a 
25° El ae 6s 
ane -6s5s ee 


The highest power of s in the determinant is 5° Replacing the 


terms not involved in the ge term by zero, 
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ee ee ee | > 





mse ggee 0 
ee _ hee 0 
0 0 - 2s 


The determinant of this matrix must be zero, so one row can al- 
ways be expressed as a function of the other rows. In other words, a trans- 
formation can be readily found which will reduce the highest power of s 
in the determinental expansion, In the above example, this transformation 
is obviously performed by multiplying the second row by - +s and adding 
it to the first row. 


Iteration of this reduction of the highest ordered terms can be 


continued until no element contains a power of s. 


In the special case of the 2X2 impotent matrix, the determinant 
of the highest powered terms of all four elements is always equal to zero, 
and thus a series of simple operations of multiplying one row and adding 


it to another will speedily reduce the 2X2 matrix to numerical form. 


(4) Solution of the numerical matrix 


The only requirement that a solution exist to the factoring of the 
resulting numerical matrix is that it be positive definite. In the preceding 
steps, the factorizability of the determinant was the only realizability 
criterion needed. If one of the original diagonal elements had not been 
factorable, this would not in general have impeded any of the steps up to 
this point even though it would indicate an unrealizable system. However, 
referring to the matrix factorization procedure as a succession of linear 
systems operating on the random process, as was discussed in the be- 
ginning of this section, it is obvious that "unrealizability" and "realiza- 
bility'' are both properties of a set of signals which are not affected by 
passage through a linear system. Therefore, a positive definite numerical 
matrix will result if the original power density spectra matrix satisfied 


the realizability criteria. A non-positive definite matrix implies a set 
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of white noise having imaginary auto or cross correlation. 


Appendix II gives a complete solution to a more complicated 3X3 


factorization example. 


Summary 


This section has presented a general method for factoring a matrix 
of power density spectra, providing that the statistics arise from a multi- 
dimensional random process observable in the real world. Alternately, it 
has been proven that a linear multi-terminal system, excited by white 
noise, can always be found which (1) is stable, (2) has a stable inverse, 
and (3) reproduces the observed statistical interrelationships in a random 


process, 


3.6 Matrix factorization: An iterative solution 


The method presented in the previous section is always valid, and 
invariably leads to an answer which satisfies all requirements, This sec- 
tion discusses an iterative procedure which will often yield a valid and 
speedy solution without the need to determine and factor the determinant 


of ® (s). This becomes especially valuable when the dimension of D As) 





is high, and when digital computers are used. 


The pole removal phase of the general procedure is readily accom- 
plished, and the real factorization problem deals with the resulting ma- 
trix with polynomial elements. Let this matrix be designated as ® (s), 


which can be expressed as a power series in s with numerical matrix 








coefficients 
ll. 
Gis) = 2 ss O | (3.7) 
=0 


The problem considered is to find a matrix H(s) which satisfies 


SS 


the equation 


Ma). H (a) = Mel (3. 8) 





2G 





where WM". 














H(s) = z - H,. (3.9) 
R=0 ———. 
= ok k 2 ie 
H(-s) moe = (3 (-1) s i, (s s) 4H. 
R=o ee J=o anata 
Equating coefficients, 
O, = B (-1)" 
y om le (=1) Hi. ° H r. k (3, 10) 
where the range of k is bound by Omak sm, CwSs- Rh S7re. 


The matrix factorization problem is, as an alternate interpreta- 
tion, 2 m+ 1 non-linear matrix equations. Suppose an approximate solu- 
tion, H(s), is known. If a small perturbation in H(s) is made with aH(s) 
and the resulting product is to equal Q (s), 


k T 
9, 2 (-I) (A, + dH) (Hy, +4 Hy) 


Neglecting the product d R * dH r= j 28 of second order, the 


proper perturbation of H(s) is given by solution ofthe linear equations 
k ij me HF 7 


The left-hand side of this equation is recognized as the matrix 
coefficient of the ‘i power of s in the error: ® (s) - H(-s) . H(s). 
After these equations are solved for aH. , the remaining ee error will 


be 


=a, 


' 
Zam, - a 


and the procedure may be iterated until the error becomes negligible, 


providing that the original guess was "close enough", 


Besides needing an approximate solution to commence this pro- 
cedure, another drawback is that the resulting solution H(s) is not guar- 
anteed to have a realizable inverse -- thatis, H(s) may contain RHP 
factors. To handle both of these requirements, a good initial solution 


for H(s) will often be the LHP factors of the diagonal elements of D(s). 
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This first trial, while obviously in error if there are any non-zero off- 


diagonal elements in 6 (s), will be close to the solution if the cross- 





correlation among the signals is weak. Also, it definitely has a deter- 
minant which has all LHP factors, which a small perturbation in the co- 


efficients of H(s) will not appreciably modify. 


Having some promise of solving the matrix factorization prob- 
lem with successive linear equations, it is useful to consider these equa- 


tions in more detail. The set of equations to be solved is, from Eq. 3.11, 


° | : a _ OfLRSm 
P_ = re (-1) (4H, : Mic dH rk OLRSMm 


2 (T= O,) 6. amy) 
where p. is derived from = 
gy T s fg € 
@ (s) - H(-s) H (s) = oo es 
,=O Sse 











The new H(s) equals the original H(s) plus dH(s). 


The total number ot independent variables is the number of in- 
dependent elements of 6,. For r even, where oe is symmetric, these 
are the diagonal and above- ~diagonal elements. z r odd, where Gi 
skew-symmetric with zero-valued diagonal elements, these are the e above- 


diagonal elements. The total number of independent elements is thus 


(m + 1) (n) (n + 1) (m)(n) (n - 1) 


E 2 Jenin - 1) 
5 + 5 = (m+i1)n -~— 


2 
The number of unknown variables is the number of coefficients 
of qdH(s), which is (m + ab THeeereree ms 1) 


2 
can be arbitrarily selected, which reflects the degrees of freedom of 


elements of dH(s) 


the imbedded numerical matrix in the complete rigorous solution, One 


way of removing this excess is to specify that dH. be symmetric. 





To illustrate these ideas and to indicate the expected degree of 
convergence, the sample problem solved in section 3.5 will be re-solved 


iteratively. 


After the pole removal phase, 


aS 7 








EO ee ie -28°+11 


® (s) = 


oy ee ee oNee 17 





The assumed solution for H(s) is the LHP factors of the diagonal 
elements of D (s) 





1.414s+ 3,61 0 
H(s) = 
7 0 1.4148+4,12 | 
3.61 0 1,414 0 
H. = H, = 
— 0 4,12 = 0 1,414 


The equations to be solved are, from Eq. 3.11, 











e B At 
@, = dHH.+ H, GH, 
fee we +n dat -@n fe we deedn 
1 oy 4E St) 1 O _i O 
Ge _ T T 
| ee er ia has © 

e 0 ~2s*+11 

OD “(s) = 

-2s8*+ 11 f 


As an example of the appearance of these equations, 


: O O o 
a [0] (aay) dhj, day, | |3.61 0 : 3.61 0 | dh 


1, 0 oO © 
11 = [0] dih,, adits, og 4et4 | 0) 4.12) \dne 





where dH, was selected as symmetric. The boxed elements of Q& in- 


iY 


dicate a set of independent equations. This set of equations can be solved 


j i j : = 3 -_ = = 
directly, yielding dh, QO, dhoo 0, dhis 1,424, 


Solving the four remaining independent equations in o°, and o,° 
yields om 
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dh 


22 





0 .660s + 1,424 


dH(s) = 
~153s + 1,424 0 


The new H(s) is thus 











1.4145 + 3.61 660 s + 1. 424 
2753 s+ 1,424 1,414 s+ 4.12 
T - 2.435 8° + 15.03 oe 4 1 
H(-s) H(s) = i : : 
- 2 we + 11.02 ~2. 568 ie + 18.93 
D 
e .435 ss - 2.03 - ,02 
® (s) = 
~ ,02 + .568 8° - 1.93 


Repeating the solution of the seven linear equations, the new H(s) 








is given by 
122s + 3.285 ~1456 s + 1,5348 
.913 s+ 1.5348 1,128 s + 3,844 
Ag 2 2 
H(-s) H' (s) = - 2,047s +13,15 -1,.957s -,.012s+10.95 
- 1,957 3° + .012 s + 10.95 - 2.105 ac + 17,16 


which is compared with the actual O(s) 


~28°4+13 ~2s8°+11 
® (s) = 


eee TH ~28°+17 





This solution is probably within the accuracy of measurement 
of O(s), and no further iteration is made. | H(s)| = ,697 ts + 5,862 s + 


10.66 which is stable, 
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In high order problems evaluating and factoring the determinant 
of H(s) can be a very difficult step. If an indication of inverse realiza- 
bility is desired, an approach similar to that used by the Nyquist stability 
Ga is evaluated, possibly by a digital 
computer, for a sequence of various values of w and plotted on a complex 


criterion is very useful. 


plane. The presence of RHP factors will then be detected by any net num- 


ber of encirclements of the origin. 


To summarize, the described iterative method presents an attractive 
alternative to the complete factorization procedure, especially when digital 
computation is employed. In an example of this method, two iterations 
solved the problem to an acceptable accuracy level. The price which must 
be paid for this computational advantage is the possibility of a non-con- 
verging solution or one which converges on a solution having an unrealiz- 


able inverse. 


3,7 Matrix factorization: A lightning solution 


This section considers a very special case of matrix factorization, 
but one which is quite simple to solve. The central requirement is that 
each non-zero element of any single row of G(s), where G(-s) . G*(s) = 


@ _(s), must have separate and distinct poles and must have a deno- 





minator of higher order than the numerator, 

4a 

SG. (-s)6,,6) = Fis) 
R=) Qi (5) 


The first question to be considered is whether each of the n terms 


he iia off-diagonal element of ® As) is 





G.(-s) oe can be recovered from a knowledge of As . Alter- 


nately, if a partial fraction of EN 1s made, Q, 8) can all 
poles belonging to a single Q,,§s) element of G(-s) or qtis) be 





grouped together, and can these groups be further separated into LHP-RHP 


product pairs ? 


The key to this grouping is that any scalar function A(-s) B(s), 
where A(-s) and B(s) have RHP and LHP poles respectively, has an in- 
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verse time-domain transform £0) which is continuous across the origin 
as long as the order of the denominator is at least two degrees higher 
than that of the numerator of A(-s) B(s). This can be proven by showing 


that £0") = £(0 ) which, by using the contour integral to sum residues, 





becomes 
om f ds A(-s) B(s) = or fas A(s) B(-s) 


The latter equation is valid since the right hand side is merely the 
left-hand side with the sign of the integrating variable changed, as the 


negative sign of the differential is cancelled by the limit exchange. 


As an example, let 


3 


A(-s) B(s) (-s+7)(st+2)(st+4)  _ 


AT) = 5 Sa) 
i - 2 3 - 4T 
oR e TT. e (T 2) 
f(o ) Basie) = “sy 


Thus, the residue of the LHP poles must sum to those of the RHP 


poles in the partial fraction expansion of any such function as A(-s) B(s). 
P..(s) 
1J 
Q. .(s) 


to show this residue equality between, in general, n séts of LHP and RHP 


Therefore, the partial fraction expansion of can be grouped 
poles, providing that all elements have distinct poles. If i = j, each LHP 
pole has an equal RHP pole in the partial fraction expansion, and this 


grouping is impossible. 


Suppose that n such sets of RHP poles have been determined in 


one element of the first row of ® As). Under the assumptions of the 





form of G(s), these sets should satisfy residue equality requirements 
in every off-diagonal element of the first row of D As). The first diagonal 


element is similarly grouped, and the corresponding LHP and RHP terms 


= ile= 





of each set are multiplied together. The resulting n terms are oF 18) = 
2 S EC IG. js), and thus when individually factored, yield the first row 
of G(s) cli may be placed in any desired order. 


: ; .th 
Having eeiied first row of G(-s), given by, Gy s), the j element 
of the first row of (s) is,G,(-s) G(s)], and thus G.(s) can be found 
vv ae j 
directly for all j, since the residue equality requirement associates each 
. of,G.(s) with the known set of poles in an element of Gj(- S) 
G(-s -s) G G(s) is then evaluated, and under the restrictions of distinct poles 


of G(s), will equal ® wy) 





As an example, the much-battered veteran of this chapter will 


be resolved, 


_ 3 298411 


(=s+2) (-s+3) (st+2) (s+3)  (-s+2) (-s+3) (s+1) (s+4) 






® is) = 





= Que te1 ~ ieee 47 


(-s+1) (-s+4) (s+2) (s+3)  (-s+1) (-s+4) (s+1) (s+4) 














1 
Deeps -aeeF is ie i i 7 5 
2 (-st2) (-s+3) (st1) (st4) 25 a3 asl s+4 
1 i 
+ 


(-s+3) (s+1) (-s+2) (s+4) 


The poles ats = 3 ands = 2 satisfy separate residue equality 


requirements, lending support to the hope that Q (s) can result from a 





G(s) with distinct elements. 


ed e ~(-s+2) (-s+3) (s+2) (s+3) ~(-s+3) (s+3) (-s+2) (s+2) 
oo a 6 see 1 
Zed 7 ses ea i nS, 


gd vyv,(s) = G)(-s) G(s) + G,o(-s) G(s) 
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G,,(s) = G(s) = 


and the resulting G(s) is given by 





1 1 

G(s) = s+3 s+2 
1 il) 

s+1 s+4 


and G(-s) G"(s) yields the given ® (s). 





But the problem is not yet complete, and this example was pur- 
posely chosen to illustrate a significant defect of this simplified attack. 
As given in section 3.5, Gis) contains a RHF pole and is unrealizable. 
Generally, the resulting solution in this method may or may not be in- 
verse realizable, but its simplicity makes the attempt worthwhile as a 
preliminary to the increasing rigor, generality, and computational 


complexity of the methods given in sections 3.6 and 3.5. 


3.8 Statistical degrees of freedom of a multi-dimensional random process 


Up to this point it has been assumed that ® Ms) is a non-singular 
nxn matrix. If [D_ (s)| = Q, this implies that enecser more rows of a 
hypothesized nxn G(s) is a linear function (not necessarily numerical) 

th 
of the remaining rows. Suppose the k row of G(s), G(s) = = C.(s) « 
g pp (s), G(s) ie (s) 
[S5'S), (k Dm). vs) = beck wis)| » where W(s) | is the hypothe- 
sized transform of the white noise excitation vector over a finite interval. 
MW, We 
Ves) = = Ci) GS), We] = 2 GE) KG) 

Therefore, v,,(s) is a redundant member of the set of signals and 
can contribute no additional statistical information on the multi-variable 
random process. At this point the representation of G(s) as an nxn matrix 
excited by n uncorrelated white noise sources is open to question, since 


there are less than n "useful" outputs. 


=BSe 





Suppose, by striking out pairs of rows and columns, the highest 
order non-singular matrix contained in ® Is) is found. Denote this ma- 


trix as D (s), representing a set of m independent components of the 





set v. It has been shown in this chapter that if physical realizability 

criteria are satisfied D (s) can be factored into enue : G(s) , with G_(s) 
excited by m white noise sources. It appears logical that the remaining 

n - m dependent signals can be derived from these m white noise sources, 


as shown in Figure 3.1. 





Figure 3,1. Formation of a multi-dimensional random process 
with redundant elements. 


The adequacy of this model will be proved in the following steps: 
a Hem!) will be found which satisfies the cross power density spectra 


k i 
produces signals Vie which have the proper cross power density spectra 


relationship between every v, and v.. It will then be shown that this H -m‘*) 


among themselves. Thus every signal will be related as indicated in the 
original ® As) matrix and Fig. 3.1 will indeed be a valid representa- 


tion of a multi-dimensional random process with a matrix D As) of 





rank m., 


To better picture the following steps, ® {s) is shown in parti- 


tioned form. 
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| | | ie: 
Gap = [Gms Dune | 2 (mcs) | leqer/ Homes| 
Drevll Ouie ve ©) Ham CS ) 


(3.13) 
From Eq. 2. 32, 


Pv.v ls) = G78) H'(s) 





H.__(s) = l(-s) P viv, (8) (s) (3, 14) 








ay 
[a (-s)| exists because of its non-singularity. But also, H (s) must 
m n-m 
satisfy the equality. 
D gs 
V 


cYy,(S) = Hts) . H__(s) 





From Eq. 3.14, the following relations must hold for the parti- 
tioned sub-matrices of ® As) 


‘ 
Oy vis = Bvyt- fa Ne]. [a *-o]. Ovy ie 


© v,,.V,(s) o ats) 0) v.v,(s) (3, 15) 














Since ®. (s) is of rank m, each of the last n - m rows can b be 
considered as a linear function of the first m rows. Let Q (s) -= AL; (s) 
0) = 15), where ©, 8) (s) and @ fs) (s) are row vectors of om (s) - A, (s) 


isa iiccicr to as ictexnl eee Writing this equation in , a matrix 





form, and recognizing the resulting partitioned matrices, 
| Fv,vjs) D 14% 8) | = Eo midvivis)| 
Py vis) - @_(s) 


and Oy vis) = A(s) O v.v_(s) 











A(s) = Ov v.(s) GAs) 


-6§5- 





® v,.V,,(s) = @ y vis) O Xs) @ vv (s) 


Thus equation 3.15 is verified, providing that some matrix A(s) 
exists, and the assumed form for H-m'®) produces the observed sta- 


tistics. Note that Hem'‘®) is fixed for a choice of G_{s) . Transposing 


Eq. 3.14, 





a0) 
-_~ 
69) 
— 
i! 


aes 0) v,.v.(-s) [aT -=) 


A(-s) ® (-s) [a7 (-s)’ 





= A(-s) G_{®) G"(-s) [a"(-s)|~* 


= A(-s) G_(s) (3.16) 
m 








For Hem'* to be physically realizable, A(s) must contain only RHP 
poles. A(s) was used as a row transformation to express the redundant 
rows as a function of the independent rows. The elements of A(s) can be 
used in an elementary transformation at the beginning of the factoriza- 


tion problem to eliminate all redundant rows and columns, leaving @ _(s). 





Physically, this means that the random process v is passed through a 


matrix filter B(s), such that the resulting output power density spectra 








matrix | 
-B(-s) O (s) Bs) = | 2m) oats 
be Wilts 
where B(-s) = I : 0 
ie 


That is, B(s) weights and adds together the m independent signals 
of v and nulls out the redundant signals. As discussed in the first para- 
graph of this section, this dependence among signals, if observed ina 
stable random process, must arise in a physically realizable system. 


Therefore, B(s), containing all the elements of A(-s), must be physically 


ee 





realizable or @ As) does not represent a real process. 





Thus an index of randomness of a multi-dimensional random 
process is the rank of the matrix of cross-spectra or, alternately, the 
number of white noise sources needed to reproduce the statistics of the 
process. Also, a set of dependent random processes is physically rea- 
lizable only if the redundant rows of the matrix of power density spectra 


can be removed by a row transformation with RHF pole factors. 
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CHAPTER IV. 


NEW RESULTS IN OPTIMUM SYSTEM THEORY 


4.1 Introduction 


The previous chapters have been in a sense an introduction, although 
a uSeful one, to the main theme of this report. It has been demonstrated 
that a linear system excited by white noise can always be found to dupli- 
cate the basic statistical properties of any stationary random process, 


single or multi-dimensional. 


In standard texts on random processes it is customary to note 
that a power density spectrum has the same form as the spectrum of the 
output of a linear filter excited by white noise. In the early paper by Bode 
and aekinon, which served to convert the highly mathematical approach 
of Wiener into a form more understandable to engineers, this white noise 
filter and its inverse were used as a means to remove all memory from 
the random process and to justify the use of a saints operation to 
obtain the optimum configuration. In this work the idea is carried one 
step further and the hypothesis is offered that within the confines of a 
linear theory a random process should be viewed as actually being the 
result of white noise exciting a linear system. Although this system in 
Some cases cannot be physically represented and the white noise sources 
cannot be traced to microscopic random phenomena, it is possible to 
make measurements on the random process itself with complete mathe- 
matical assurance that there is such a linear system "upstream and 


around the bend". 


This hypothesis would be only of mild interest by itself, but this 
chapter will show how this simple assumption makes the study of sta- 
tionary random processes purely a measurement problem and how it tends 
to unify the conventional analysis techniques of linear systems and those 


of stochastic processes. 


ai 








4,2 Matrix differential equations and system state 


The heart of the description of a linear physical system is its 
"state", which effectively describes the condition of every internal en- 
ergy storage element at every instant. Since a random process is to be 
analyzed in terms of its equivalent system, it is useful at this point to 
summarize the major features of the matrix theory of differential equa- 
tions, such as is found in Belen in order to emphasize the state 
approach to the analysis of linear systems. In this case, matrices allow 
compact expression of ideas without regard to order and dimensionality 
of the system under consideration. The standard theory outlined in this 
section will provide a foundation for clear presentation of the original 


results to be presented in the remainder of this report. 


The basic matrix representation for a linear system is presented 
in the following equation 


eds 
dt 
where x is the n-dimensional state vector of a linear system, Aisa 


x = Ax + Du (4,1) 


constant nxn matrix, Dis a constant nxm matrix, and u is the m-dimen- 
sional excitation vector. For example, consider the simple second-order 
system of Figure 4.1, where a spring-mass-dashpot system is being ex- 


cited by an external force F. 


B 





Fig. 4.1 A simple second-order system 


Here the differential equation is 


max + 3B dx + = F 
a5 : alas 


Defining one state variable, x 


dx 
to be x, and Xo to be ape is 


1’ 
sufficient to fix the potential and kinetic energy of the system. The sys- 


tem equations are next cast into the general form of Eq. 4.1. 
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x, om x O} 
2 | =) + | 
OF 2 M M xe) M 
The initial condition response or free behavior of linear systems 
will be of particular importance in the study of random processes in 
following sections. Given x (0), it is desired to find x (t) under condi- 
tions of no external excitation. It would obviously be desirable to have 


a solution in the form 
x(t) = B(t) x (0) 


Assuming this form and substituting into Eq. 4.1, 





a B (t) x (0) = A B(t) x (0) 
or 
so Bt) See) 
dt 
The series 
3 at nt” 
Bit) = I+ At+A > . wi 720s aaa 
d 2 n poe . 
Where 7 Blt) = A+ A t. . oe ee (nzi)! = A B(t) 
satisfies this equality. 
Thus, = - 
B (t) = Zz ne se where A°2 Lis the desired 
M2oO nN 1 9 


solution and is known as the matrix exponential eeu a quantity that is 
convergent for any value of A andt. It is analogous to the scalar ex- 


ponential, and occupies a position of pivotal importance in linear systems 


analysis. 
If Eq. 4.1 is Laplace transformed, 
s x(s) - x(0) = A x(s) + D u(s) 


x(s) = [s1- A] a) sI-A]~ D u(s) 
- (4, 2) 
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The eigenvalues of the matrix A are thus the pole locations of 
the transform of the transient response. If, for example, A has only 


diagonal elements A. 


4 1 x. (0) 
fra]? = [p43 Fy] ane ago 


i 
r,t 
x(t) = x.(0) on (4, 3) 
In this case, the state variables refer to a system which, in La- 
place transform terms, has been expanded by partial fractions into a 
series of simple poles. There is no unique state of a system, since any 
non-singular linear transformation can be made on a particular set of x. 


Ifx ST y, substituting in Eq. 4.1 yields 


d : 

Ta Y= AtTy 
d weet 
a ee 


A transformation on A, where 7" * A T becomes a diagonal ma- 
trix, is always possible if A has distinct eieeiwalues”-. In this case, 


the general solution for a free system is, from Eq. 4.3 
y(t) = fe os 5, y (0) 
re x(t) = rc J. | ee x(0) 
x(t) = T Pex ,,| ie x(0) 


Therefore, 


Wii Fas 5] al 


for any A which has distinct eigenvalues . and which is reduced to diag- 


onal form by T. In the general case, from Eq. 4.2 
Saas fst-a]™ (4, 4) 


An alternate way to visualize the concept of state is to integrate 


and Laplace transform the basic equation, Eq. 4.1, yielding 
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2a = ah) = Ae = D us) (4.5) 


This expression with integrals immediately yields a form suitable 
for direct méchanization on an analog computer. The number of integrators 
required would equal the dimension of x. The input to the ith integrator 
would be ie wn 
a Ai; XG) + 2 dsj UG) 
and its output would be the ith state variable of the system x.(s). Relating 


a state to an output of an integrator lends a particularly clear meaning to 


this concept. 


In summary, the state of a system is the set of numbers which at 
every instant is sufficient to define the signal level in every energy storage 
element. In a linear system which is not externally driven, the state tra- 
jectory is given by 


Hi) = 6 “eee (4. 6) 


4.3 Interpretation of the optimum linear predictor 


The mathematical form for a linear predictor, optimum in the 
mean square sense, was one of the first significant results in random 
process theory, as presented by Wiewen and Ai teoser This section 
will show that this predictor has a very simple interpretation in terms of 
the generating model for the process. For generality, the multi-dimen- 
sional case will be discussed, which of course includes the scalar or 1xl 


problem. 


Chapter 3 has shown that a random process can always be viewed 
as a generating matrix G(s), excited by a set of unit-valued uncorrelated 
white noise spectra. The optimum predictor for T seconds in the future 


in a random process is given by 


w_(s) =| ois) |™ Sz Hottes B ()} (2. 28) 
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where 


© (s) = © .e I = e® ™ Gs) @eta) (2. 33) 
vi Vv ee NE aa as 


After transposing, and substituting in Eq. 2.28 
zs -l 
W (s) =Ax { st G(s) } G (s) (4, 7) 


Figure 4.2 shows the resulting structure. The input white noise 


vector w is successively transformed into the given random process v, 


bi Ned 
we) Tee] we Toa] wt) [ir eel ve) 


Fig. 4.2 Configuration of an optimum multi-dimensional predictor 
back to the original white noise, which then passes through a system 
-1 a 
given by Zk fe G(s)t : 
Suppose first, for simplicity, that the ijth element of G(s) con- 


tains only simple poles. 


Ns k. 
Gy\s).=-eaemmgeneamas 
1j Az] Ss + a. 


This element can be portrayed in flow graph form, as shown in 


Figure 4.3. 


INPUT 





Fig. 4.3 Typical transmissions of the ijth element of G(s). 


The set of values for x. completely defines the state of the system 
and if white noise excitation should suddenly be cut off at t = 0, x. (t) 


would equal x,(0) ae 


The ijth element of ek {e8? G(s) } is then 
ne am. -A4:T 
14°42 hut “4 = 2-dhiiScmmm 


Ava S+ 4; a= St Ca 


ae 





A flow graph of this system is given in Figure 4. 4. 
x 


| 
NPUT OUTPUT 


eae 





AN 


= ee 
ses Ss va 


Fig. 4.4 The ijth element of of f eas) 


A very significant interpretation can be made from comparison 
of Fig. 4.3 and Fig. 4.4, 2 and 44" 4e5T a, (s)} BR e IAS 
excitation, and continually reproduce the same state variables, x, . The 
difference is that the output from each first-order system which helps 
to form the prediction for v(t +T) is weighted by the value of the unit 
initial condition response in7T of its own system. That is, just as the 
present value of V. is a linear numerical function of the state variables, 
so is the optimum predictor the same linear function of these state var- 


iables after an initial condition decay of 7 seconds. 


More generally, the best prediction in a mean-square sense of 
the state of the random process T seconds in the future is the initial 
condition response of the generating system from this state. Upon re- 
flection, this seems to be a reasonable result when one views the state 
at time t +T as the sum of the initial condition response from the state 
at time t and the results of white noise excitation from time t tot +T, 


the latter being essentially a zero-mean unknowable response. 


The above demonstration included only the case of simple poles. 
As a may contain multiple-poles, it is necessary to verify the decay 
of the state as contributing to the optimum predictor for this case. In 
all of linear transient analysis, the case of multiple poles is one handled 


with considerable difficulty. In the following proof, a canonic flow-graph 
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configuration will be postulated for a repeated pole transmission. The 
contribution to the optimum predictor will first be found using the 
straight-forward jet fe&"a, (s)} expression from Figure 4.2. Then, 
the expression obtained by ee each state variable of oe and 
allowing each to decay as an initial condition will be found and manipul- 


ated into the same form. 


Figure 4.5 shows a canonical configuration for a parallel trans - 


mission of Coe involving m cascaded poles ats = - a. This form has 





V4 
= 


Fig. 4.5 Canonical form for a transmission involving multiple- 
order poles 


internal node variables which are the system state variables. 


The transmission from the state variable, -: to the output is 


given by the recurrence relation 





TAs) = T,(s)) (7 20) 


where k = O f 
O 


Iterating this relation yields after simplification 


r 2 ae R R S$ TT k 
1) aren (S+a)* (s+ay> ¥ aa) —_ 


which is also seen by inspection by tracing the paths from node j to the 





output in Figure 4. 4. 


The transmission from the input node to the output includes all 
the repeated pole terms in the partial fraction expansion of Soe and 


is given by 


PSs 





= <- 
— 





Te 


An 


(S$) = Ran Tm (S) = o Lh Remake 

S A=0 (See 

It is hypothesized that the contribution of this repeated pole 
transmission of See! to the optimum prediction of the ith variable is 
given by the sum of each of its state variables allowed to decay as ini- 
tial conditions for 7/seconds. Thus the cascaded system of Figure 4. 2, 
which operates on the "recovered" white noise ee should supply a trans- 
mission from We to the rhode, | weight by the numerical factor of 
the unit initial condition response int from node r, and sum over all r. 
This system must then be equivalent to the result of applying the known 
solution be a G.(s)} for this multi-pole leg. 


The unit initial condition response 1(s) from node r to output V; 
1 
of Figure 4.4 is given by applying a unit step, amr to the rth node, yield- 
ing 
$ Fu 
ae) = = lee) = agi 2 ey | ies Rrod 
eee k ae CS 
The inverse Laplace transform of 1(s) is the desired weighting 
for the rth state variable as a function of T. 
ae = es 
; -AaAT7T A 
A,r) =e SL, iS Rr-4 T € 
Az} ay 
A: 


The transmission from input to node ris 
m~-r 


EL, Red 


S+ 6 al 
Therefore, the repeated pole part of the hypothesized optimum 





predictor is 


r=) (StA)™T FT 


which should equal the known result 


Asi * ie Sasi 
tafe" Ta 6)} = hee 2 ae 





ie 





quantity £2 z { _87 T, ff will now be manipulated into the form of Eq.4.8. 
fa 6. ae (5 $ ee fae 
= ei { S Ue Rea (tea) eg & (4+) t 


If 
Qe 
a 

[MA 
oan 
ro” 
Pom 
oo 
Te 
Nvt- 
~ 
be ee 
Gy 
an 
a7 
(dS 
Q 
Ny 
LY~ 





where ( : ) is the binomial coefficient, (3 


2) lea 
= Ran —* pe 
me an = rane: ~~ J 
= o Z SS ae Tage 
Expressing this series in terms of powers of - —— where j=i-p 


m=] wa] = 
Me Gar a Iho Bet pace = 
P=o (S+a) Az=P U~-p) | 
Replacing i by m-r+u and p by m-r, 


y= l Y~-! mt 
rd Gayo { 2 diy Rmh na ee 
ts: MA 


[= ALA=O / 
an ye r-) | 
= am — + 
= BE fh, Rm-g > TT Ryg EFT 
=} é A=m-re | 
nw (Sues oe pO AX : AA 
“Sime | ees | Sais 
2 Z Ww ey | e4r 4 = ve J tH ae 
r= | St+a)™ r+} aa] AA / 


which is equivalent to Eq. 4.8, completing the proof. 


A more elegant proof can be made with the aid of relations de- 
veloped in Section 4.2, where G(s) is considered a general system with a 


set of state variables, x, described by the matrix differential equation 


d 


q * => Ax + Dw (4.1) 


and where the output v is given by v = Rx. 


From Eq. 4.2, the input to output transfer function is implicitly 


given by 
v= R ole 


It is desired to prove that 


Bar 





AZ fa {e87R eee Dt} = R eAThs1-al” D 


which means that w is operated on by [s i= ae D to produce the current 
state variable vector, x(s), which is then weighted by its initial condition 


decay = ene reproduced at the output by R. 


a = as 
IZ {STR [st-Al] "Dt = x { hoe ue he 
i 
- & {R eT Atyt 
: 20 
according to a property of the matrix exponential proved by Bellman 


And, completing the proof, which applies for single and multiple 


roots alike in single and multi-dimensional systems, 
A4R a e* p} = R ef (ere Ne D 


In sharp contrast to the arduous multiple-pole derivation made 
above with involved manipulations with series, the use of the general 


state equation provided the desired results with a minimum of effort. 


Thus it has been proven that the optimum linear predictor for a 
stationary random process can be regarded in all cases as the result of 
computing the state of the random process and allowing these state 


variables to decay as initial conditions in the given model of the process. 


The significant feature of a random process is then its state, 
which summarizes for use in the present and for future prediction all 
past behavior of the random signal or signals, using a compact number 
of variables. An expected trajectory of the state variables of the random 
process, and any system on which it may act, is then defined at every 
instant by these state variables just as a free determinate system settling 
to equilibrium is defined by its state variables at a single instant. This 
allows a wealth of known information concerning the behavior of unforced 
linear systems to become applicable to systems which are driven by ran- 
dom processes, especially in control applications, Chapter 5 will ela- 


borate on this interesting by-product of the new approach to the repre- 


~ "em 





sentation of random processes by the state concept. 


The concept of state is only useful if the state variables are re- 
coverable from operations on the random process alone. If the matrix 
model, G(s), has a realizable inverse, which accounted for most of the 
difficulty of Chapter 3, this is obviously necessary and sufficient in 
order to ensure that the state variables can be separately found by a 


stable system. 


Having found that the future value of a random process is given 
by (1) the sum of present state values decaying as initial conditions, and 
(2) the response of an "empty" system to future values of white noise, it 
is now interesting to investigate the knowable properties of this white 


noise buildup. 


The error of the optimum single-dimensional predictor will be 


solely due to future white noise excitation. Figure 4.6 shows this optimum 





configuration. 
. - ati ae il 
—4 oe) | S| _ a ie} 









Zor 


Fig.4.6 Error configuration for an optimum single-dimensional 
predictor 


The transmission from w toe is 
H(s) = G(s) a - TR “aah: fi G(s) § 
Suppose that the es response of G(s) is w(t). 
H(s) = at fey eS at 
2. (8) = H(s) H(-s) = J mye 1 dt fog 


- if. s(to-t,) 
2 s(t.-t 
a inj ds ee®) = ie dt fe dt wt, ) w(t) ori i | e 2 1] 


==) —~ joo 


assuming that the order of the integrations may be changed. But 
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Jeo 
1 s(t. -t 
27) S : 
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DS a) 


] 
— joo 


where Ad (t) is the unit impulse at t = 0. 

a basal 
if dts wt) i dt, wt.) M(t, -t,) 
o oO 


iP 2 
J dt w(t) (4.9) 
oO 





eH) 





e4(7) 


This general result indicates that the mean-square value of signal 
level at the output of a linear system, when the white noise is suddenly 
turned on at t = 0, is equal to the integral of the square of the impulse 


response from the excitation point to the output. As a check, 
of° 


oO J 
Sls = fat ve) = = f ds G(-s) G(s) 
27] 


Jeo ae 





Jeo 

Obviously, if more than one uncorrelated white noise source is 
driving a system, the resulting variance of an output signal is equal to 
the sum of the variances from each excitation point considered separate- 


ly. The next section will use this result to motivate a quantitative replace- 


ment for the Nyquist sampling theorem. 


4.4 A quantitative measure of sampling error for non-bandwidth limited signals 


A classic problem in numerical analysis, pulse code modulation, 
and sampled-data control systems is the loss of information because of 
representing a continuous signal by a series of evenly-spaced samples. 
The conventional approach is to utilize the so-called Nyquist Sampling 


Theorem as given, for example, by Ragazinni and Bronk ince 


, which 
states in essence that a signal of absolute bandwidth Jt can be recovered 
if T, the sampling interval, is less than — 


In practice, since absolutely bandwidth-limited signals do not occur 


Zoe 








in a random process, it is customary to apply a liberal factor of safety 
on the Sampling Theorem rate for the approximate signal bandwidth. 

This section will discuss a more basic and quantitative approach which 
considers the actual average mean-square error inherent in the sampling 


operation. 


Suppose, for convenience, that the continuous random process 
is generated in the canonic models of Section 4.3, and sampled at the 
output. At every sampling instant each state variable is summed to form 
the output. The changes in the state variables at successive sample times 
arise from two separate effects: (1) The state variables decay as initial 


conditions for T seconds, and (2) White noise builds up for T seconds. 


It is natural to postulate a discrete generating model for the pro- 
cess which has the same state variables as the continuous model at the 
sampling instants, and whose discrete transition is equivalent to T sec- 
onds of continuous initial condition decay. The discrete excitation of each 
state variable is then a random uncorrelated string of pulses which has the 
Same mean square value as T seconds of white noise buildup to the partic- 
ular node. In example, suppose a random process is generated as shown 


in Figure 4.7. 


a 
LJ 
= | “Jn @,-(s) - 


nen 
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(sta) (- Sta.) 


oF | 


Fig. 4.7 A simple random process generating model 


The unit decay of the state variable during a sampling interval is 
Meee che white noise buildup is siveniby 


2 —— a -at 2 _ k? -2aT 

OC = J dt w (t) = | dt (ke ) = Da (l-e ) 
o 6 

Figure 4.8 shows the discrete model which creates a random 


process which is hypothesized to produce the same statistics as the sam- 


pled process of Fig. 4.7. Here 2 (- se is a unit delay operator. 


=O 








(w AS Ww R LA ies | Vat 
spate. 
Fig. 4.8 Discrete model derived from Fig. 4.7 
; ai -sT 
The power density spectrum of a realizing that z=e ) 
2 
at k -2aT 1 1 
D gaya?) ; Q na?) 2a (Le MG 4 eree ) * (1 ere ) 
where @ » {z) = 1. (See Appendix II. ) 
Considering Fig. 4.7, 
2 
YY (T) = ae _. Ppt 
VV 2a 9 
Ee _* -a | "IT 
i ves Ca 2a : 
D(a) = | + ee?! a 
vtve 2a l+e“B" z i+e-SF 2 


, Ke (ine — 
Za (l+e-*7?z)(1l+e-&F gm!) 

This example has illustrated the relation between discrete and 
continuous models for random processes, showing that the same dis- 
crete power density spectrum is obtained from considering either white- 
noise buildup over the sampling interval or through straightforward z- 


transform techniques. 


The best estimate of the continuous variable v(nT+t) from its 
samples v(nT) is v(nT) a for t<T since the future effect of the white 
noise cannot be predicted. In the general case, the best estimate has 
the current state variables decaying as initial conditions until the next 
set of state variables is computed. In analogy to the continuous case, 

a suitable inverse filter can always be found to recover these state 


variables if the continuous model is inverse realizable. 


The reconstructed error of the random process is the difference 
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between the actual value between sampling instants and the initial con- 
dition decay -- in other words, the amount of white noise buildup at 
the output of the generating model over the sampling interval. This 
irreducible error is the fundamental penalty for representing a random 


process in terms of its samples. 


The result of this discussion is that, from ao 4.9, the mean 
Square error between sampling intervals is Ff dt w a(t), “ Ba) is 


(4). 


the model impulse response. The average error is thus = 7 Jarfe dt w 
It is now proposed that a useful measure of the error due a *sampling 
is the fractional error power, or the ratio of the mean square error 


to the mean square signal level 


roof 
F.E.P, 4 oa far fa w(t) (4, 10) 


O Vy 
fat a F 
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This provides a quantitative measure of the inherent penalty for 
sampling any random process, regardless of the spectrum shape. An 


example will illustrate the utility of this approach. 


Suppose the continuous model for an observed random process, 


v, is given by 


1 
G(s) = (x3) (s+4)_ 
oxaye= a St ett 
~ 1 a K 2 1 2 7 FE 
e = F far {awe = 4, + ae (1 e ) 
0 O 

1 -6T 1 -8T 
- a¢q_7 1 -e )- Bar (l -e ) 


In this form it is difficult to obtain the average square error for 
small T, and especially to solve for a T to meet a certain fraction of the 
mean Square Signal level. An alternate route is to expand G(s) in ascend- 


1 
ing powers of a 
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i 7 
G(s) = ae = — - —p 
—- 2 3 
i 12 S S 
l1+—+-—-— 
S 2 
S 
7 7 2 
wet) = ¢t 5 t 
Wo) = te ee 
= is 
1 
Re = “TF { at ees = r a An jee 
12 20 
FEP sialon Ge =- T° = 7 TC 
ten 1 20 3 4 
————. cua = 14 20 - 5822 7 ie 


8 
If the FEP is specified to be .01, an approximate value for T is 
given by 


meee (2 1/3 = .089 Seconds 


This section has used the concept of white-noise buildup (1) to 
show the mechanism by which sampling of a random process always de- 
grades knowledge of the signal, and (2) to present a quantitative measure 
of this error from which a rational decision can be made for a proper 


sampling interval. 


4.5 New results and interpretations for the optimum filtering problem 


A physical system which operates on a given random process 
can be viewed as a means of continuously extracting all possible informa- 
tion about future values of error from present values of input signals. 
An optimum system should result in an error signal e which is on the 
average unpredictable from and unrelated to past values of input signal 
v. In a linear statistical theory, this lack of relation can only be measured 


by a correlation function, which means that 
E 1% e-Tr ein} = Pret =0 (720) 


eq] 12. . 9) 
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for a random process with n inputs, under this requirement. Accordingly, 


L] Yne,o | = te 4 Prev = © 


To 





But e(s) | - i(s) | — Ws) v(s) | where W(s) is the optimum 
system to be found, and i( s) is the desired output vector. From Eq. 2.32, 


Bs) = Sts) - Fs) Wis) 


en me ey 





Therefore, 
ee {PAs wis) f = dp "LB sr} (2. 18) 


which is an implicit statement of the optimum multi-dimensional system, 








which was obtained with considerable more difficulty (and perhaps more 


rigor) in Chapter 2 by an alternate route. 


By either method, the basic statement of optimality of realizable 


linear systems is then 


mie i {© 0] = (4,11) 


This result will be used to motivate a closer look at the prop- 
erties of optimum single-dimensional systems. In particular, the filter- 
ing problem will be examined and an optimum unity feedback system 
will be derived which takes advantage of some not readily apparent prop- 
erties of the standard mathematical solution given by 

W (s) = ue is) dt “1 Of) \ (2a) 
wv BV) 

Figure 4.9 shows the basic configuration to be examined. The 


following restrictions apply: (1) The signal s is derived from unit density 





Figure 4.9 The basic filtering problem 
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white noise passing through a linear system G(s), (2) The noise nis 
derived from unit density white noise, uncorrelated with the signal 
white noise, passing through a linear system G(s), (3) The signal s 
is the desired quantity to be reproduced at the output of W(s), and (4) 
W(s) is to be a unity feedback system, with forward transference H(s), 


such that W(s) = Fate 


This model is of sufficient generality to include many filte ring 
and control problems of practical interest, and its solution will later 


motivate a completely general solution. 


From Eq. 2.9, 


® (s) 


ve 


® (s) [1- ws] - S_ (s) wis) 


I, H(s) 
© se!) 1 + H(s) J PD nn'®) 1 + H(s) 


Hence, from the basic equation, Eq. 4.11, 


-1 1 ” -1 H(s) 
74° 5,09 oa} - 24" {8,09 BQ} ae 


Two very important facts are revealed from this equality. Since 
the positive poles of @ _.(s) do not generally equal the positive poles of 
0) an’ S)2 this equation will ony hold in general when (1) the poles of H(s), 
which are the zeroes of tts) include alli the me poles of Q .. 
and (2) the zeroes of H(s), which are the zeroes of TF Hi) ° 


all the positive poles of O. nis): If this were not so, then in the te! 


<—_ 


partial fraction expansion of both sides, there could not be pole-by-pole 
a 

equality. Let Mi (5) 

H(s) = ase H (s) 

where N,(s) a and S(s) are the LHP poles of Q tS) and Q. 38), respec- 


tively, and H Pies is an additional term which does not eo any of the 


Signal or noise pole terms. 


The optimum system is, from Eq. 2.15, 


a5 





ae 





Sp (S) Sp) Np) (S) { 24 '{ Sat) ial cae U(s) } 
Asn Os) Sp) 


where V_(s) equals the LHP zeroes of @ (s). 


. H(s) 
Equating this to 1+ Hs) i and solving, 
hy U(s) 
2 F(s) - NFU(s) 


Although this is not obviale by inspection, the polynomial S as) 
must be a factor of V_ 7 (5) - N 4s) U(s) in order that Eq. 4.12 be oe 


according to previous — 


| Ue) fe 4, Sig? } 


L_ Oy, SS 


H(s) = Vz (5) Us) Ss SHG } See c 
Nes) Spy Spo) Brr(s) - hd DCS) 








Ss 
-l 
This leads to the interesting conclusion that {4 ot. zis ay} 


which contains only the signal poles, is equal in this case to san sum of 
signal — in a partial fraction expansion of om wry 5) since no cancella- 
tion of S n(s) is allowed. A more general proof ane this important identity 


will be ae later in this section. 


Therefore, 
; a 
H(s) = S Signal Poles of Q (8) a S(s) (4. 13) 
S Noise Poles of @ *(s) N(s) 
_ 1 ee 
Ws) = TyHG) => *S(s) + Nts) —_ 


This result is of considerable practical and theoretical interest 
and applies to all single-dimensional filtering problems, when noise 
and signal are uncorrelated. The optimum system determined, W(s), 


has the following significance: 


The best estimate of an input signal under a mean Square error 
criterion is that the signal originated from signal poles of a single system, 


with transfer function om (8) and excited by unit-density white noise. 


=e 








The optimum system then merely determines and sums the canonic state 
variables of the signal portion of the random process generating model. 
The optimum predictor in this noisy case is intuitively the result of allow- 
ing these instantaneous state variables to decay as initial conditions 

for the desired 7 seconds. This is verified by noting that, where the sig- 


- k 
nal poles of the generating model are IA ; al =e i , the op- 


timum predictor is given by Oy) A 3 tha 
1 all Sy -KT 
o*,(s) LK D ..(s) ‘ = wt ’ Se 
a Gres) J Ow “SHR: 


which computes and weights state variables for T seconds of initial con- 


dition decay. 


The above simple interpretation of an optimum system was ob- 
tained through rather a roundabout method, and holds only for uncorre- 
lated signal and noise and a one-dimensional random process. But having 
this result, it becomes simple to extend it to the general multi-dimension- 
al filtering and prediction problem with all possible correlations existing 


between signals and noise. 


The basic equation defining the optimum multi-dimensional system 


is the transpose of Eq. 2. 28 


Ws) -f47! { oT [a7 (-s) 1 G(s) (2, 28) 


Also 





6 "(s) = G 5  ( G "Gs (2.33 
vit®) 7 ap ss 8) is qe) ; J {3d 238) 








The perfect operation on the signal, G ,{s), is I for the filtering 





problem. Thus, 
2 is i a5 z . 
wis) =7¢'4Dw Let-s)]* +66) [a7-s)] it G(s) 


In analogy with the simple case discussed earlier, it is desired 








now to prove the ag term is merely the result of expanding each element 


of G(s) in partial fractions and retaining only those with signal poles. 


aoe 





First it is necessary to identify the signal poles. Figure 4.10 


shows a multi-dimensional model for the formation of correlated signals 


and noise. 








WHITE 
NOISE Vt 


Fig. 4.10 A hypothetical model for the creation of correlated 
Signals and noise 


Given the auto and cross power density spectra of the signal and 
noise vectors, where the noise vector can be of less dimension than the 
signal, a (n+m) x (n+m) realizable and inverse realizable matrix filter 


G_fs) can always be found which can reproduce the observed statistics 





of the separate signal and noise components. The poles of the ith signal 


are the poles of the ith row of G_ fs). 


The matrix of power density spectra of the signal and noise signals 


is given by Gots) G(s) which in partitioned form is 
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The positive poles of the ith row of G_nt®) appear only in the ith 
column of the above partitioned matrix. That is, the positive signal poles 


appear only in the sub-matrices QD _(s) and O .(s), and the positive 





noise poles only appear in ® _(s) and Q (8) : 


Considering the observed random process v, where v=Ss+n 


and zero elements are permissible inn 





Ds) = DH +9o+ Oe + ® 5) (2. 31) 
Groye= Gl-s) G(s) (2. 27) 
VV 
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T T -1 T te ac -]} T 
G(s) {Be + Oy," E (-s) | {55,8 + D (s)}|c (-8) | 


But G(s) has no RHF poles. 


d4 ‘{ols}= ats) = 72" (87,8) + Die] [ets] 
da {[G.,(0 + 6" (s)| [o7k-s)] 7} 


Since the first and second bracketed terms above have only positive 
Signal and noise poles, respectively, they are immediately identified as 
the separate signal and noise terms in a partial fraction expansion of G(s), 
which is the desired proof. 

Let G(s) = S(s) + N(s), where all the signal and noise poles are 
grouped together in S(s) and N(s), respectively. Of course, if one or 
more signal poles are identical to a noise pole, the contribution of these 
signal poles to S(s) would ie obtained through = separate partial frac- 
tion expansion inZt (cme a s) D (a). seu ae) ) (s) since 

ss ns 


they could not be separated in a partial fraction expansion of G(s). The 








optimum filter is then, from Eq. 2. 28, 


wis) = S(s)  [S(s) + Ns)] ~ (4, 16) 





A unity feedback system is readily seen to have a forward loop 


transmission 


His) = S(s) Nis) ie (4. 17) 





Fig. 4.11 A canonic optimal multi-dimensional filter 


Fig. 4.11 is invalid if N- No(s) is singular, which would be the 
case if one or more of the input signals is uncorrupted by noise. In this 


case, the canonic configuration of Fig. 4.12 is still applicable, providing 
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the trivial restriction of signal having to be present in all input compo- 


nents of v is satisfied. 





Fig. 4.12 An alternate optimal multi-dimensional filter 


These optimal configurations have an interesting interpretation 
as systems which compute inner signal levels of an effective random pro- 
cess generating model, G(s) = S(s) + N(s). As shown in Figure 4. 13, 
the optimal configurations merely act to reproduce quantities which exist 


at the input and outputs of the signal and noise portions of the model. 





Fig. 4.13 Signal reproduction in optimum configuration 


Kalman and pucuae recently presented an approach to the optimum 
filtering problem which considered the special case of pure white noise 
corrupting all input signals, with no cross-correlation between signal 
or noise. They postulated a model of the original signal generating model 
which appeared in the forward path of a unity feedback system. In the 
light of the above analysis, it is easy to see why they were unable to ex- 
tend their results, since, from Fig. 4.11, the model which should have 
been specified is the signal generating portion S(s) of the hypothetical 


model G(s) -- which creates the actual signal observed and not the pure 


=O 








Signal component. 


The results of this section are particularly important, both in 
understanding and in operating on random processes with linear systems. 
In essence, it has been shown that the physical system found from factor- 
ing a matrix of input power density spectra contains in its signal levels 
all the knowable information about the random process which can be ob- 
tained by linear measurement of the random process. The optimum system 
has the simple form S(s) [S(s) - N(s) | ao where S(s) contains all the signal 
poles (positive poles of D _.(s) and Q .(s) ) in a partial fraction expansion-- 
element by element -- of G(s), the effective generating system. Figures 


4.11 and 4.12 show canonic forms for optimum feedback systems to filter 


the multi-dimensional random process. 


4.6 Correlation functions and initial condition responses 


Auto and cross-correlation functions have an appearance similar 
to the dynamic behavior of linear systems, usually decaying to zero ex- 
ponentially asT becomes very large. This section will relate the corre- 
lation functions to the initial condition response of the white-noise driv- 
en linear model for the random process with an equation of considerable 


Simplicity and generality. 


First, suppose that the cross-correlation function fx,x(T) is 


known between two state variables, x, and 2 that are defined in a linear 


i 
System by the general equation 


d 


ae x = Ax + Dw (4.1) 


where x is the n-dimensional state vector, and wis a r-dimensional 


white noise vector. 
since from Eq. 2.9, 
0) a ake) = S P x.x(s) 
- _ 4d 7 . 
Px k(1) = <. Px (7) = E{x(t). se (t+7 )} 
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=r x(t) ya 


| a w,(t) 


“WA we 
ae = wa So E {x,(t) ; x (+7) } + ra Fe E x, (t) w (tT) } 


E Ax(t). witeT)} = 0 (~ > 0) 


since future values of white noise are not causally related (ie: correlated) 
to present values of system signal level (or, more formally, since D x. w,(s) 


contains only RHP poles). Thus, 








MWe 
d : 
ap of xx(h = 2 an Fx (7) (T70) 
Writing this equation in matrix notation, 
d 2 T 
av YT) : Cae a Care 
Transforming, 


et YG 0} ~~ @) -dt7{ 6 1s)} At 
{#71 T,() = Yo [ts-at] 7 


But, from Eq. 4.4 


fas 6) - Yih 








4 AT) T 
Pun = 9 an feet (T70) 
Transposing, 
Tt AC een 
Peet = ‘Cocielle B 2) (4, 18) 





This is the desired general relationship, which shows that the an 
correlations between state variables in a linear system are mapped 
through time by the same transformation that governs the decay of the 


state variables in the linear model: x(t) = e* : x (0) 


Now, it remains to use this result in order to show the meaning 


of the correlation functions which would be measured at the outputs or 
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output of the random process. Suppose that the r-fold output vector v 


is obtained through multiplication of the state vector x by a rxn matrix R 
WwW 


v(t) = aia ne Blt) 


= 
The cross-correlation function of two output signals is thus 
nr. iA 
= r 
byiv (7) Z i momges 


Or in matrix notation 


PMT) = R YIM) OR 


T 





For T70 , using Eq. 4.18 


Yr) = R Yo) [e*™]* R? 




















Transposing, 
T tL 
pin = R ete R Ya(0) ] (T70) 
Since Yvx(7) : 2 Vax Px, X(T) or 
mt) = es 
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This equation is in proper form to permit interpretation of the 
output correlation functions. The initial condition response of the system, 


viewed at the output, is 


ce a Oe 


Therefore, if the vector x(0) is set equal in the model to fxv (0) = 
tv. x(0) then the transient observed at the jth output terminal will be Pav t Me 
In words, this means that the cross (or auto) correlation function, PvivT) 
between two signals in a random process is the transient which would be 


observed at the jth signal location when each of the system state variables, 


x,» is initially set to x, v.(0) and the system released. 
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This result tends (1) to re-emphasize the basic nature of the 
hypothetical model which is capable of generating a given random pro- 
cess, and (2) to interpret the correlation function as a transient of this 


model. 


4.7 Advantages of the state and model approach to random processes 


This chapter has been written in the hope of altering current ways 
of approaching the visualization and study of random processes by the pre- 
sentation of a simple explanation for the mathematically-complex results 
of contemporary theory. In a sense, the basic question is whether one 
should look at what a system does or whether one should look at what a 


system is. 


It was necessary to first ensure that such a system can always 
be found from auto and cross-correlation functions of a multi-dimensional 
random process. This was the contribution of Chapter 3. With this assur- 
ance, the conventional Wiener theory could be reworked with complete 


generality. 


Section 4.3 considered the optimum predictor configuration. It 
was shown that this problem is only a matter of continuously measuring 
the state variables and weighting them by their initial condition decay 


for T seconds. 


Section 4.5 dealt with the problem of filtering extraneous noise 
from a desired signal. In this case it was shown that the equivalent gen- 
erating model was actually two systems in parallel, one associated with 
the signal and the other with noise. The optimum filter merely computed 
the output of the signal portion. With the recognition of this simple inter- 
pretation, two general canonic feedback arrangements were found which 


Should be of considerable interest in control systems design. 


In section 4.4 a quantitative measure of error due to sampling of 


a random process was presented. This was determined from the buildup 


2OG= 





of white noise between sampling instants in the model. 


Section 4. 6 showed that correlation functions can be regarded 
as transient behavior of the effective model under certain initial con- 


ditions. 


In all these results, the ideas of white-noise excited system 
and system state play the dominant role. ''State'' and "system" are far 
more general terms, however, then their use here would indicate. It is 
interesting to conjecture at this point how these concepts might aid the 


study of non-stationary and non-linear random processes, 


First, in the case of non-stationary random processes it seems 
highly probable that the conceptual results derived in this chapter remain 
valid, providing that the effective linear time-varying model for the gen- 
eration of the process is known or can be found. The optimum predictor 
could still neglect future values of white noise and uSe only present values 
of system state, but of course in this non-stationary case the initial con- 
dition decay would no longer be described with the matrix exponential. 
Also, the case of finding a time-varying inverse of the effective generat- 
ing model in order to recover the state variables appears possible if ex- 
tremely difficult. Further promise in this respect is lent by recent work 
by Kalman and ener who have derived an optimum time-varying system 


which remains similar in form to the stationary case. 


In the case of so-called non-linear random processes, which are 
distinguished by decidedly non-Gaussian probability distributions, itis 
appealing to hypothesize that they occur as the result of independent white 
noise driving a Suitable non-linear system. Further, from current work 
in this field, for example by nevee it appears possible that such a non- 
linear system might be a finite-state linear system driving a memory- 
less non-linear function generator. This is an interesting alternate ap- 
proach to the study of non-linear random processes which is more ap- 


pealing to the engineer than the more general and highly-mathematical 
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, 26 
treatment of, for example, Wiener 


In short, it is hoped that the simple physical interpretation of the 
optimum linear systems presented in this chapter for a stationary random 
process will motivate a similar approach to more complex stochastic 


problems. 
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CHAPTER V. 


RANDOM PROCESSES AND AUTOMATIC CONTROL 


5.1 Introduction 


Stationary random processes have been examined in the previous 
chapters with an eye toward delineating the recoverable information which 
exists as a result of optimum linear operations on the signals. The con- 
cept of a generating model, excited by white noise and possessing state 
variables, has been shown to be a particularly effective way to visualize 
the action of optimum systems -- that they perform essentially a measure- 


ment or signal recovery of certain quantities in the generating model. 


The time has now come, however, to consider how this increased 
intuitive understanding of random processes can be of help when control 
decisions must be formulated as a result of the information received. 
The general control problem is of great interest to mathematicians and 
engineers alike, and most significant control problems involve signals, 
wanted and unwanted, which are random in nature. In this chapter we 


restrict attention to the following situation: 


A fixed linear system exists whose output is to be forced to follow 
a stationary random input signal, which in the limiting case of a regulator 
is constant. Corruption of the command signal with noise is allowable. 
Also, load disturbances may be present which are stationary random pro- 
cesses uncorrelated with the input signals. Finally, the controller con- 
figuration is completely arbitrary as to the possible use of linear and 
non-linear elements, with the single important limitation that the con- 
troller output signal which drives the fixed system be limited in ampli- 


tude to correspond to the saturation level existing in the controlled system. 


section 5.2 considers the scalar problem and develops a design 


philosophy which appears to have considerable promise in the optimum 
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control of saturating systems. The particular problem of load disturbance 
in linear and saturating systems is treated in Section 5.3. With this foun- 
dation, contemporary approaches to full-throw control which can be found 
in the literature are critically analyzed in Section 5.4. Finally, Section 
5.5 presents an extension to the determinate Second Method of Lyapunov 
to include random processes. This leads to a design procedure suitable 
for a multi-dimensional saturating control system, optimizing a quadratic 


error criterion. 


In the past chapters general equations, simple proofs, and sweep- 
ing statements could be presented with mathematical aplomb because 
of the simplicity and power of linear methods of analysis. But in this 
chapter the spectre of saturation has arisen to confound our linear theory 
and the whole tenor of this thesis must change. No longer can general 
quantitative statements be made concerning system behavior; it is diffi- 
cult enough to make useful qualitative observations. We must be content 
with small nibbles at this frontier of control theory and recognize that 
the verification of original ideas can only come with computer analysis 


and can only be valid for the specific cases investigated. 


5.2 Saturation and control in a stochastic environment 


It is profitable to consider again the optimum unity feedback con- 
figuration derived in section 4.5 for the recovery of one-dimensional 
signal from noise. This is shown in Fig. 5.1 where the input signal v 


is composed of two hypothetical components, signal s* and noise n*, which 





Fig. 5.1 Optimum filter configuration 
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are the best estimates of the actual signal and noise in a mean-Square 
sense. This minimization of mean-Square error means that s* is the 
expected value of the actual signal component, conditioned on a physi- 
cally-realizable linear recovery. In the system depicted in Fig. 5.1, 

the expected value of error at every instant is equal to zero, since the 
output is the expected value of signal. Now, from section 4.3 it is known 
that the expected future value of signal in a linear system excited by white 
noise is derived from the decay of the state variables. Applying this fact 
to the optimum filter, it is seen that at every instant the expected value 
of error is zero for all future time because the output element S(s) has 
the same state variables as S(s) in the generating model and they both 

are not further excited (as w remains zero in both configurations). There- 
fore, an alternate statement of optimality in the linear filtering problem 
is that the expected value of all future error be zero at every instant. 
With this interpretation, the use of a mean-square error criterion is 
seen not to lend much emphasis to the squared-error per se, but rather 


it acts as a mechanism for reproducing expected values. 


The reason for the emphasis on the particular use of a mean- 
Square error criterion in the linear theory is that when saturation occurs 
in practical output equipment it does not necessarily mean that the op- 
timum :non-linear control system must be designed on a mean-square 
error basis to be consistent with linear random process theory. In other 
words, the random process generating models emphasized in this work 
contain internal signals which should be the recovery goals of a non-linear 
saturation-limited practical control system, but the measure of error in 


recovery is entirely at the discretion of the designer. 


since the optimum linear system is constructed so as to make the 
expected value of error zero for all future time, a logical choice fora 
saturating design criterion should obviously involve this expected future 
error, which is, of course, the best information available at any instant 


for future use. A convenient way of decomposing this future value of error 
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is to consider the initial condition decay of random process and fixed 
system state variables as one component, designated e(T), and the re- 
sponse of an otherwise ''empty'"' fixed system to the future output of the 
controller, c(7?), as the other. With this division, the job of the controller 
at any instant is to formulate and execute the initial action of a plan that 
will make c(T) equal to e(1) as rapidly and efficiently as possible -- a 


pursuit problem. 


It is important that this viewpoint be understood in order to follow 
the presentation in this section. The effect of all past input and control 
signals is summarized in the state variables, which are in turn used to 
represent the expected value of future error without further control, e(T). 
In most cases of practical interest the control plan c(T) will start at zero 
and must lie somewhere on or within the boundaries formed by the appli- 
cation of either maximum positive or negative step inputs to the fixed 


system. Fig. 5.2 shows two possible control trajectories for a given 


+ ,’ RESPONSE To 
6 MAXIMUM POSITIVE STEP 
/ 







Aitage: 
FUTURE TIME. 


Fig. 5.2. Possible control trajectories 


e(T), where c(T) is obviously better than co(T) since it reduces the 
expected future error more quickly. In formulating this plan, the con- 
troller must select for each future instant some value of command signal 
within the saturation constraints, preferably to satisfy some design con- 
dition of optimality. Then it must execute the initial command of this 


sequence, and in the next instant the following changes will occur: 
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(1) The state variables that were previously in the fixed system 
and the random process generating model will decay as initial conditions, 


as indicated by e(T). 


(2) The controller command signal will have perturbed the fixed 


system state variables, as indicated by c(T). 


(3) White noise will enter the random process model and further 


change these state variables. 


Because of the change in (3) above, the previously computed ap- 
proach plan of the controller is no longer valid, and a new one must be 
computed. This frustrating need to solve for an optimum c(T), use only 
the initial action, and then discard it an instant later is caused by the 
fact that we have imperfect knowledge of future events and must ''muddle 


along'' with the currently available information. 


The use of expected future error is avery significant formulation 
of the problem of control in a random environment, for it transmutes a 
stochastic problem into a determinate one that is solely a function of the 
state variables of fixed system and random process generating models. 
Some possible criteria and general means of solution are presented 
next, followed by a more detailed look at a particular design which has 


the virtues of near optimal performance and easy mechanization. 


The most general approach to this problem would employ the 

techniques of dynamic programming, which in this case would attempt 

to minimize some integral of a function of the state variables over all 
future time as the error approached the zero or equilibrium condition. 
To accomplish a valid solution by this means, thereby developing a con- 
trol decision as a function of all the state variables, would require con- 
siderable ingenuity, very large amounts of digital computation, and is 
properly outside the scope of this report. The mechanization of the solu- 


tion would in general involve a table lookup capability for the control system. 
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Another valid criterion for the design would be one of time-op- 
timality. In analogy to the determinate or bang-bang regulator problem, 
which specifies that the time required to make all the state variables 
of the controlled system equal to zero should be minimized, one could 
demand that the future expected error and its defined derivatives be 
brought to zero in the quickest possible time. It has been proven in most 
determinate cases that full-throw or maximum effort control yields a 


minimum time solution. 


Thus a set of transcendental equations could be easily written to 
equate the expected value of error and its n-1 derivatives to zero at some 
future time after n switching intervals, where nis the number of state 
variables in the controlled system. If these equations could be continuously 
solved to determine the duration of the first switching interval, then the 
switching time of the control system would occur when this switching 


interval became zero. 


Unfortunately, the actual real-time solution of these transcendental 
equations appears quite difficult, assuming that a solution even exists. 
One source of difficulty is that the dependent variables, the switching 
times, must be constrained to be positive and in a certain order corres- 


ponding to successive sign changes of the control variable. 


Another more abstract objection can be made to the criterion it- 
self. First of all, the fact that the expected value of error and its defined 
derivatives are zero at a certain future time does not ensure that they will 
remain zero over the remainder of the interval, unlike the determinate 
case, since the saturation of the controlled system may prevent it from 
following exactly the further decay of the random process State variables. 
Next, the existence of a future value of zero of this expected error and 
associated derivatives does not necessarily mean that the intermediate 
values of error in transit were small. That is, the requirement that the 


error derivatives be brought to zero simultaneously may cause the con- 
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troller to select a trajectory which is obviously less desirable than one 


which approximately 'matches up" at a considerably earlier time. 


In the two approaches considered, the dynamic programming and 
the time-optimal, it is clear that there are very difficult analytical pro- 
blems as yet unanswered, and that the sophistication (and consequently 
cost and size) of the control equipment must be relatively high. Is there 
then no way of practically utilizing the state variable approach to random 
processes in control? In the remainder of this section we shall discuss 
a proposed scheme of single-dimensional design which has many appeal- 
ing features, not the least of which is the ease of instrumentation. Then, 
in Section 5.5 a comparatively simple multi-dimensional saturating con- 
troller will be described which is based on an extension of the Second 
Method of Lyapunov. The case of load disturbance will be dealt with in 


Section 5, 3, 


First of all, it is useful to reconsider the objectives of using the 
expected value of error ina design criterion, By constructing a non-linear 
System which would reduce the expected value of future error rapidly if 
white noise were suddenly cut off, it is hoped that the truly optimum 
linear system will be closely approximated. This hope is based on the 
observation that the optimum linear system produces, if white noise 
were cut off, a zero value of error for all future time. An alternate 
interpretation is that the best estimate of future error is its expected 
value. A decision scheme for control that always tends to reduce this 
expected future error in an efficient manner will, on the average, yield 
desired performance under the constraint of saturation and will best 
utilize the information about the random aspects of the problem available 


from linear theory. 


Full throw or maximum effort control is selected in order to 
capitalize on, rather than linearize, the saturation in the output equip- 
ment. This will guarantee that the mean-square corrective effort is at 


an absolute maximum, Also, it has been proven an optimum mode in time- 
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optimal determinate control systems. 


The simplest criterion to use would be that the future error become 
zero in the smallest time. This would be nothing more, if full-throw 
control were used and there were no noise at the input, than an error-con- 
trolled relay. This is patently not a very satisfactory solution, for the 
large error derivative which would usually result at the instant of zero 


error would ensure a large error before the next zero crossing -- possibly 


an unstable buildup would occur. 


However, if it were specified that the future error and the error 
rate should be brought to zero simultaneously in the quickest possible 
time, then the result of non-coincident second and higher order derivatives 
between desired and actual output would have a definitely much smalker 
effect on the amount of later error. This specification would mean that 
the error would be brought to controllable proportions in the shortest 


time. 


It is much easier to understand these ideas with the aid of typical 


control trajectories. Figure 5.3 shows a curve of expected error with no 


ERROR 


EXPECTED 





Fig. 5.3 An almost time-optimal control trajectory 


further control, e(t), and a superimposed planned control trajectory, 
c(T). The controlled system of this example is assumed to include an in- 
tegrator, and the initial path of c(T) corresponds to the step response 


of this system to a positive saturation-constrained input command. At 
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time 1 the sign of the control variables is changed from + to - , and the 
expected future error, which is the difference between e(T) and c(T), 


is brought to zero with zero rate at time T, : 


Figure 5.4 shows a similar error plot, only the problem has ad- 
vanced to time Wee The new e(T) is the expected value of error with the 
new c(f) set equal to zero for all time greater than Ty: which corresponds 
to the difference between the e(7T) curve of Fig. 5.3 and the dashed path 
indicated by "0 at te 
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Fig. 5.4 Switching time determined by tangency 


The very significant fact demonstrated in Fig. 5.4 is that the time 
to switch from + to - is te because at that time e(7) first becomes tangent 
to a c(7) representing the negative applied step. On the basis of this, we 
can postulate a control law for the proper sign of the current full-throw 
forcing variable, which is the desired output of the controller. If ct(7) 
and c-(7) are defined as the step responses of the controlled system under 
maximum poSitive and negative steps, respectively, then the current 
forcing function should be either + or - depending on whether the most 
future intersection of e(1) is with c+(}) or c-(T). This switching law al- 
ways yields an output which continually seeks to reduce large errors with 


maximum effort, and switches at the last moment (when the tangency first 
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occurs) in order to reduce the expected error and error rate to zero 
simultaneously. An intersection is always guaranteed, since (1) the ran- 
dom process models in this theory are stable, and (2) the step response 


of a system will always exceed the initial condition response as F-» oO . 


Before proceeding on to a practical mechanization of this idea, it 
should be reemphasized that at every instant the control computer is deal- 
ing with expected future trajectories and its current decision is made as 
a function of present random process state variables. The planned approach 
to reduce future error to zero will in general never be completed exactly, 
for future white noise will enter the system and perturb the state variables. 
The design philosophy is, however, that the unpredicability of white noise 
means that on the average the decisions made will be the best for the con- 


ditions existing at that time. 


Suppose a high-speed repetitive analog computer is used to gen- 
erate (1) the expected error e(1T) as a function of the current values of 
the state variables and (2) ct+(t). Tis now computer time. It is desired 
to determine whether e(7T) intersects finally with ct+(T) or c-(T) as 
becomes large. When le(T)| = [o+(r)| = 0, or alternately, (1) - ot 2(7) 
= 0, an intersection has taken place, and the sign of e(7) at that instant 


determines whether c+(T) or c-(T) has been crossed. 


Fig. 5.5 shows the proposed analog instrumentation. The opera- 


tion is as follows: 
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Fig. 5.5 A proposed full-throw controller 


At the beginning of the computer cycle, current system and random 


process state variables are introduced as initial conditions in an analog 
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system which will reproduce e(T) and c (7) at its output when released. 
With the trivial identity, e7(1) - ctr) = | et) +c” (+) | le(r) - ctr) 
the intersections of e(T) and c (TJ) result in an output from a zero-detect- 
ing device (perhaps a suitably configured relay with a small dead zone) 


which energizes coil K,, momentarily closing switch Sy: Capacitor C 


then ''remembers" the oiond e(T) according to the previous zero cross- 
ing. After a suitable run, the computer is recycled, and the programmed 
closing of switch So delivers the last e(T) voltage at the output. This sampled 
signal has the sign of the desired polarity of the maximum command to 

the fixed system; further, it becomes zero when the present and future 

error becomes zero. This makes it a desirable switching function to drive 

a command relay with an arbitrarily small dead zone which will prevent, 

for example, a continuous cycling under Zero error conditions. Alternately, 


a limiter with very high but finite gain near zero input can be used as the 


output command element. 


The computer repetition rate is chosen so that an error of one cycle 


in switching will have small effect on the accuracy of control. 


This configuration has the virtues of (1) being applicable to any 
scalar linear system which saturates and any random process, regardless 
of order, (2) being based on a design criteria which is intuitively satisfy- 
ing, and (3) being the first practical design offered for a saturating control 
system which uses all the available statistical information and tends to ex- 


ploit rather than linearize away the incontestable saturation phenomenon. 


5.3 Optimum feedback configurations with load disturbance 


The previous chapters have been mainly concerned with extract- 
ing useful information from an input signal. In a control system, one of 
the reasons for using feedback is that a disturbing signal often exists at 
the output equipment. Figure 5.6 shows the conventional means of mani- 


pulating a disturbance d inside a loop into a form which can be dealt with 
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Fig. 5.6 Manipulation of load disturbance to obtain standard 
cascade configuration 


in the standard theory. This is the approach taken by Newton, Gould and 
Moser and by Sanne There are two difficulties with this step however. 
The first is that the form of the feedback path must be assumed. Secondly, 
and much more important, the preliminary dilution of disturbance and in- 
put signal creates an unnecessary task for the optimum system in separat- 


ing them again. 


It will be demonstrated in this section that in a linear theory load 
disturbance does not affect the basic statistical design. As a start, one 
optimum system which theoretically reduces the effect of load disturbance 
to zero and yet operates optimally on the input signal is given in Figure 

S(s) 
5.7. Here 


S(s) + N(s) 
where ® is) = G(s) G(-s) and G(s) = S(s) + N(s), the signal and noise 


is the optimum system proven in section 4, 5, 


components, respectively. 






XS) 
Sis) + N(s) 






Fig. 5.7 Elimination of load disturbance with infinite gain amplifier 


A more practical elaboration of this scheme is given in Figure 5. 8, 
which shows an arbitrary transfer function H(s) enclosed with a minor 
loop with infinite gain. This configuration is of considerable practical 
Significance since it is optimum, compensates any fixed minimum-phase 


transfer function, unless the excess of poles over zeroes of H(s) is such 


hee 








Fig. 5.8 General form of an optimum feedback system 


to lead to instability as K—~», and eliminates any effect of load disturbance. 


Unfortunately, these pleasant linear conjectures are often based 
on the principle that a mouse can pull an ox-cart if beaten hard enough. 
If H(s) in Fig. 5.8 has a saturating characteristic, the random process 
entering the system at d becomes significant, and must be separately 
operated on to compute its state variables, which contribute additively to 


e(t), the expected value of future error used in the previous Section. 


9.4 Contemporary designs for full throw control of a system subject toa 


random process 


seen has presented with his "predictor" controller the first 
fruitful attack on the problem of saturating control of a random process. 
His idea is quite simple. A fixed future time }* is selected for the pre- 
diction of a number of derivatives of the input random process equal to 
the number of state variables of the controlled system. Then, the controller 
is designed as a standard bang-bang servo in order to reduce the error 
between present position and this future command signal in the shortest 


possible time. 


There is, of course, a glaring flaw in this reasoning. If T7* is fixed, 
the only valid control decisions are made under the particular conditions 
when this “error' between present position and future command can be 
actually brought to zero exactly in 7* seconds. Otherwise, and in the 
general case, the controller plans to drive toward the correct position, 


but at the wrong time. Fig. 5.9 shows how this disregard of the actual 
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time required to obtain a change in state can result in poor control deci- 


sions, using the display presented in section 5. 2. 







er) 
OPTIMUM CONTROL TRAJECTORY 


ae 


To 


EXPECTED FuTuRE ERROR 


Fig. 5.9 Consequences of a fixed 7* in the Smith predictor servo 


Benedictines based his dissertation on this lack of optimality in an 
attempt to justify or discredit this approach with analog computer simula- 
tion. His results indicate that this Smith predictor servo is better thana 
bang-bang controller which ignores any future change in the control signal 
(ie: t* = 0), which is to be expected. He also notes that increasing the 
value of T’when the input signal level is high improves performance, which 
again is logical since the actual time required to reach the specified state 


would tend to be larger. 


Hopkin and ence have taken perhaps a more logical look at this 
problem. They make a Taylor's series expansion of the input random 
process signal, and attempt to find a set of control switching intervals 
which will reduce all the derivatives of the extrapolated future system 


error to zero simultaneously. 
The two defects in this approach are: 


(1) The intrinsic quantities of the random process, the state 
variables, are neglected in the Taylor series approximation, this provid- 


ing a poor error prediction. 


(2) The resulting transcendental equations are difficult to solve, 


if a solution exists at all. 
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In summary, it is felt that the two attempts discussed above have 
merit as beginning steps, but that the problem outline and approximate 
solution of Section 5.2 more clearly define the optimum system and best 


utilize the information contained in the input random process. 


9.9 Multi-dimensional bang-bang control of systems subject to random 


process inputs 


There are three general classes of power actuator in a control 
system. First, the output transducer may be conservatively rated and 
perform in essentially a linear manner, which allows use of the large baty 
of design information on linear control systems. Secondly, it may operate 
in a partially saturated condition, the improvement of which case having 
been considered earlier in this chapter. Finally, the power actuator may 
be fairly inadequate and under-rated for the job presented by the input 


random process. 


It is this latter case which will be considered in this section. The 
corrective action of the controller will not have a pronounced effect on 
the error, but it is desired to optimize the effect small as it may be. Es- 
sentially, what will be done is to define a figure of ''badness" for the state 
of the controlled system and of the random process which is a measure 
of the expected future error. Then, the control or controls will be con- 
tinuously thrown in such a direction So as to maximize the rate of de- 


crease of this figure of 'badness" at every instant. 


The control system chosen for illustration of these ideas is a re- 


gulator, but the ideas are equally applicable to a servo application. 


To structure this design procedure in an orderly fashion, it is first 
necessary to present some results of the venerable ''Second Method of 
mapunov"?-. Then, an original modification will be made in order to ex- 
tend this determinate theory to include random processes. Finally, it will 


be shown how an optimal control law can be found as a linear function of 
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the state variables. 


The Second Method of Lyapunov is not so much a method as it is 
a way of characterizing the free dynamic behavior of linear and non-linear 
systems. It uses a type of generalized energy expression, and examines 
the rate of change of this function for various states of the system. If this 
energy expression, called a Lyapunov function, tends to decrease every- 
where except at the equilibrium point in a region of possible system states, 


then the system is considered stable in this region. 


In the particular case of a free linear system with no external 


excitation, the standard differential equation form is 


x = Ax (4. 1) 


A Lyapunov function, V(x), is chosen as a quadratic form OES 
where P must be positive definite and symmetric. From the results of 
section 3.3, it is known that, if P is positive definite, it can be factored 
into two matrices 

| ee nh) 


with N having real elements only. Thus, 


S.. - (Nx) Nx 


which is the square of some linear transformation N on x. 


One choice for P might be such as to make V(x) equal the energy 
of the system. According to the Second Method“, the system is stable 


if and only if <. Vie, m0 for all x, where x 7 0 
d . 4@ At 7 d a T d 
Te V(x) = at x Px “4 x | Px+x {2 xt 
i a x = AX 
dt 
¢- V(x) = ee + x PAx = ya | Tp + pa | x 
anus, AY PSP =O (5. 2) 
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where Q iS some positive semi-definite symmetric matrix, if an WAG-4) 


is always to be negative for any value of x. 


The above relations are very important to the linear theory. Since 








Ve) oO 
V(x) = J dV(x) = dt vo : f dt (- cd 
and . od Ate 4 
x ee = fat (aon) (5.3) 
t 


the two Symmetric matrices, P and Q, provide quadratic forms which are 
related in an integral fashion. Accordingly, if HCE represents some 
measure of instantaneous error of a free system, x” Px is the integral 

of this error over all future time, which is a very useful error criterion. 


n(n+1) 


A inde- 


Eq. 5.2, in this case, must be solved for P with 


pendent linear equations for a given Q. 


maeeee suggested, in the case of a linear system Settling to equi- 
librium, that one form of good full-throw control would attempt to maximize 
the negative rate of change of V(x) at every instant, V(x) being a suitably- 
defined error criterion for the system without further control. In this 
case 


qa «OT CAXK + De 


c being a control vector having an amplitude constraint on each component. 


d Tpx | = d T Te. jad 
ae <a boa (2 Tt ees xte{t x} 


substituting from the matrix differential equation 


a fT 7) Px resets [ Ax + De | 
AE x Px 


| Ax + De 


_ [aTp + PA| x + 2 ee + eye 


aa [a’P ts PA | ae ao Px 


i 


Since a scalar can be transposed at will and P is symmetric. Therefore 


to select c, So as to maximize the negative rate of change of x PS 
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sign a sign { pTpx} (5. 4) 


and the magnitude of Cc. should be the maximum possible. P, being a 
measure of future error of the system, will be found by postulating a 
positive definite matrix Q, which represents the instantaneous error, and 


solving Eq. 5.2. 


With this admittedly brief account of some of the available techniques 
from the determinate Second Method of Lyapunov, it is now desired to con- 
sider how this theory might be adapted to include stationary random pro- 
cesses, Since the state concept has been extended in this work to stochastic 


inputs. 


To fix ideas, the regulator problem will be considered. Without 
any control action, a physical system H(s) is shown in Fig. 5.10 being 
acted upon by a random process which is hypothesized to originate ina 
white-noise driven system G(s). The output e is an undesired error. From 
the results of this and the previous chapter, it is known that the expected 
value of e(t +7) fort? o is completely specified by knowledge of the state 
variables of G(s) and H(s). Therefore it is logical to define a Lyapunov 
function, ee which represents an integral error criterion over all future 
time of the expected value of error from the total state vector. That is, 
the concept of system in the Lyapunov theory is enlarged to include the ef- 
fective system which generates the random process. 

The error e and its m - 1 derivatives are linear combinations of 
the m state quantities of H(s). Thus, Ja ace ea. .. . can all be weighted 


with a non-negative measure of instantaneous undesirability. 
i-1 


Ife =Bx,, where e. = CAs e, Bis a mxm matrix, and x, is 
h i i-l h 


the state vector of H(s), and the measure of undesirability of e is given by 


Pe = x" BEB, , then the matrix Q to weight the instantaneous undesira- 


bility of the entire state vector x is given by Q B : © | 
= --- 
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W Us 


Fig. 5.10 System subject to random disturbance 


The first m elements of x are identical to x 
ao 


T 
The integral error criterion pI a Qxdt= x Px 


is found from solution of is 

Ap Pp eee SO (532) 
if Ais the matrix of the differential equation which governs the entire 
system 

d 

mae x = Ax 


Now, suppose that it is desired to regulate the error with con- 
trols that saturate and have small effect on the physical system in com- 
parison with the random process. Bass's approach, described previosuly 
in the determinate case, appears to have considerable promise in this 


problem. 


The objective of the control system in this case is to maximize 
continuously the neg ative rate of change of the measure of future error. 
Bq. 5.4 18S Still valid 

sign a sign { p? P xb, 


with D defined by 


A simple example will illustrate the ease of application of these 
ideas: 


BXAMPLE 


A spring-mass-dashpot configuration is shown in Figure 5.11. FR 


is the disturbing random process, with power density spectrum R 


(sta) (-sta) 
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Fig. 5.11 A regulated second-order system 
Be: is the regulating force, with maximum amplitude constrained to be = A. 


The random process is generated in an effective system with trans- 
fer function R 





Soe shown in flow graph form in Figure 5.12. 
sta i+ a 
Ss 
W) K 2 
FE ! FE. 6) = R 
R R Dp 
Div) = | 


(+a) (Stra) 


—4 
Ss 
Fig. 5.12 Flow-graph representation of random process generation. 


The differential equation of motion of the mass is 





Defining X, aS X, X_ as x, and X_ as F 


R? the following matrix 
equation results: 


. o 1 = x) 00 faa 
d ! 
= = | eee -t WW 
” O O ~4&] Xs OAK 
or 
d 7 
Bay x = A x A DF (4, 1) 


It is desired to minimize the motion of the body, with the squared 


velocity given a weight of » with respect to the displacement squared. 
One possible choice for p, 


K? would weight equally the kinetic and poten- 
tial energy of the system. 
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Ae O 
Open oO 
O O 


Q — 
O 


Next Eq. 5.2 must be solved. 


A: pie Pew = =e (5. 2) 


In this example, this equation is readily solved for the 6 independent 


elements of the symmetrical P. Solving, 





_ K M B _. M_M A 
Pit op (x tH) + oe : Pag Bp ‘ox ?t 5) 
oe: BM + aM’ +“4AKM 
$30 2aM YA? By aK BM+K B- 

2 : 2 oe , AE 
Pio ox? Pyg = (Bata M)P3.- 5B et) 

Pog = aMP,. 


From Eq. 5.4 


F =- Asgn {o™px |, 
1 1 1 
= SA Ben \> Gr Papesiae Wes oor ae ean Pyg ®t 
. 1 oer 1 B+a(MK + M) 
ee ic ae Sate: Se, *o* OK K+ a(aM+ B) sy 


Here sgn is an operator which equals + 1 if the enclosed quantity 
is positive, and - 1 if it is negative. 
This is the linear switching law which continuously tends to maxi- 


mize the rate of decrease of the error criterion 


I[é [x(7)] } 2 {x [xr] } 2 | ade 


which is a function of the state variables. 


If it is desired to discount future values of error with an exponen- 


=r 
tial, e u , the matrix Ais replaced by A - bI, since the Laplace trans- 
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form of the system is given by [sl = A| i which, when s is replaced by 
s+lr, becomes [s I+lrI - Al”. 


This discounting becomes particularly significant in the case of 
a servo, where integrators often appear in the controlled system, because 
the integral of the squared initial condition response of an integrator is 
infinite, and the design impossible. Replacement of = by — means 


a finite square response exists and this procedure is applicable. 


To summarize the advantages of this proposed full-throw control 


of a random process: 


(1) The system becomes more and more optimum as the inadequacy 
or non-linearity of the control transducer is emphasized. 

(2) The design procedure is simple to execute and results ina 
completely linear system except for the output relay. 

(3) The resulting system is guaranteed to be stable from the non- 
linear part of the Second Method, since Vx) is always negative. 

(4) Multi-dimensional designs can be made with no more theoretical, 


computational, or hardware difficulty than the single-dimensional case. 


These four advantages make this proposed design philosophy very 
promising for practical applications where power transducers are inade- 


quate for their job in a stochastic environment. 
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CHAPTER VI. 


SUMMARY AND CONCLUSIONS 


6.1 Outline and summary 


The results obtained in this thesis investigation intertwine through- 
out the entire theory of stationary random processes, However, there 
are three fundamental and significant contributions in this work: 


(1) Development of a complete multi-variable theory 


The random process with many separate but statistically related 
Signals -- the so-called multi-dimensional random process -- has been 
studied to an extent that there is now no conceptual difference between 
single- and multi-dimensional theory. An important analytical tool in 


this respect is the equation 


2 T 
D (s) = 2 2 G.(-s) Oxy (s) es (s) (2, 32) 





which compactly determines the statistical relations between signal vectors 


in a linear system as a function of the properties of the inputs. 


The key to the solution of the optimum multi-dimensional system 
in the Wiener sense is the concept of matrix factorization, which separates 
a matrix of cross-power density spectra among the members of an input 


random process v, according to the equation 


0) (s) 


= G@sjec ore (2. 277) 
VV} 


such that both G(s) and G(s) represent transfer function of stable systems. 














With this factorization, it is possible to represent the optimum 


multi-dimensional system with 


w'(s) = [at(s)] 7? Za? {ah-s) B (sh (2. 28) 


The matrix factorization problem is one of great complexity. The 


general solution presented in Chapter 3 was reached only after many 
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other approaches aborted. It basically consists of a series of simple steps 
which manipulate the given matrix into desired forms, of which the final 
one is a numerical matrix which can be easily factored. An iterative 
method is also presented which shows promise of efficient and rapid so- 
lution when a digital computer is used to solve resulting sets of linear 


equations. 
(2) Introduction of .a theory of random processes based on physical models 


The results of this thesis indicate that the simplest understanding 
of random processes and of the optimum systems which operate on them 
is obtained by hypothesizing that some linear system is being excited by 


white noise to produce the random process, v. To support this claim: 


(1) The result of matrix factorization, G(s), is such a system, 


where G(-s) eae) = DAs). 





(2) The optimum predictor merely reproduces the individual state 
variables of G(s), and weights each by its reduction after 7 seconds of initial 


condition decay in the model. 


(3) If G(s) is separated into two parallel systems, S(s) + N(s), 
associated with the signal and noise components, respectively, of a random 
process, then the optimum filter merely recovers the output of S(s). This 
optimum filter is, in canonic form, a unity feedback system with a forward 


oy 
loop transmission of S(s) N(s) . 


(4) Auto- and cross-correlation functions can be interpreted as 


the initial condition response of G(s). 


Incidental to this approach, it was found that the fundamental state- 
ment of an optimum physically-realizable system is that any resulting error 


should be uncorrelated with the past values of any input signal. 


(3) The state of a random process viewed as fundamental information for 


control use 


The state approach has proven a powerful tool in the analysis of 
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determinate system behavior. A major contribution of this thesis is the 
extension of these techniques to include the study of stationary random 


processes. 


The output of an optimum predictor minimizes the mean-square 
error of the estimate, or alternately, is the expected value of the future 
Signal. It has been shown that this expected value is merely the initial 
condition decay of the current state variables of the effective generating 


model. 


As was discussed in Chapter 5, the optimum filter at any time t 
has a configuration which causes the expected value of future error, e(t+T), 
to be zero for all positive values of T. Thus, for the purposes of control, 
the actual system will more closely approximate the optimum system as 


the expected value of future error is minimized. 


A typical control problem has an input random process which is 
to be followed and perhaps filtered, a fixed linear system which is externally 
controlled, and a disturbance which acts on the fixed system. The expected 
value of future error is given by (1) initial condition decay of state variables 
of the input and disturbance generating models and of the fixed system, 
and (2) the effect of future control action on an otherwise "empty" fixed 
system. This problem of control becomes quite difficult when an ampli- 


tude constraint is placed on the control variable. 


Thus, the controller of a saturating system must continuously 
select a control variable which is a function of all the state variables so 
as to tend to minimize, with some criterion, the expected value of future 


error. 


Two general and feasible solutions to this problem have been given 
in Chapter 5. Both assume that the best operation of the system will re- 
sult with full-throw control. The first solution selects for a criterion 
that the control variable should switch at an instant when the expected 


value of error and its derivative can be brought to zero simultaneously 
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along the next control trajectory. 


The second solution considers the effect of future control as 
small, and always tends to minimize a quadratic measure of future error. 
This is accomplished through extension of the classical Second Method of 
Lyapunov to include stationary random processes. A particularly signi- 
ficant result of this approach is that it permits a rational and easily in- 


strumented design procedure for a multi-dimensional saturating system. 


6.2 Paths for future research 


In the course of this thesis investigation, many problems were 
encountered which could not be satisfactorily dealt with in this report. 
The following discussion presents some of the more prominent of these 


in the hope that further interest and research can be stimulated. 


(1) In many random processes of practical interest, for example, 
the national economy, there are availablea great number of possible com- 
ponents for a multi-dimensional analysis, Assuming that this stationary 
theory might approximate the true behavior (which would probably not be 
valid) it is interesting to conjecture what might happen as the number 
of scalar processes used becomes very large. Since the error in predic- 
tion of a variable, for example, is always made less as the dimension 
of the random process increases, one intuitively feels that the prediction 
error could be made arbitrarily small by analyzing enough processes 


which cross-correlate with the variables of interest, 


It would be interesting to obtain some measure or bound on the 
increase in precision obtainable by considering an additional correlated 
random process, without completing a refactorization. Also, a means 
of selecting the most useful (in the sense of reducing prediction error) 
members of a set from consideration of their correlation functions is 


needed. 


(2) A general solution was presented in Chapter 3 for the matrix 


=a 





factorization problem, The resulting answer for G(s) can be multiplied 

by any real unitary matrix U, where U. io = I, without affecting the 
validity of the answer. Although existence has been proven, uniqueness 

has not. A useful further addition to this theory would be a proof of this 
uniqueness of G(s) -- or, by counter-example, that a multiplicity of answers 


exist besides the unitary transformation. 


(3) Two significant alternate statements of optimality for linear 


systems have been found through analysis of the Wiener theory. 


(a) Zero correlation exists between present error and all past values 
of input signal. 


(b) The expected value of all future error is zero at any instant. 


Considering (a), this could be generalized to the non-stationary 
and ''non-linear"’ case by specifying that all measures to indicate statistical 


relation between signals be zero between past input and present error. 


In the case of (b), this statement appears to be just as basic as re- 
quiring that the mean-square error be minimized. This appears to have 
an immediate application to random processes which are non-stationary 


and/or non-Gaussian. 


The viewpoint of the author is at variance with much of the present 
work going on in non-linear random process theory. It is suggested that 
a possibly fruitful (although modest) line of attack would be to specify simple 
non-linear models for the creation of the process from independent white 
noise, and determine suitable statistical measurements which could fix 
the parameters of these models, This approach is in contrast with a theory 
which attempts to be totally general (ie: wieneron but which results in 
models which have an infinite number of state variables. If a finite- state 
model -- perhaps a linear system followed by a memeory-less non-linear 
function generator -- could be found to represent adequately a class of 
random processes --=- then the optimum configurations for systems to operate 


on these processes could be found through extension of the state and model 
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concepts outlined in this work. 


(4) Practical control of systems operating in a stochastic environ- 
ment will generally involve considerations of saturation, unless the de- 
signer is willing to pay for the validity of the linear theory with increased 
power actuator size, weight, and cost. The optimum controller has been 
shown, in this case, to be one which continually plans to reduce the ex- 


pected value of future error to zero in some optimal fashion. 


There appears to be considerable promise in attempting a dynamic 
programming solution to this most important problem, If a maximum 
effort control system is specified, the desired solution is the delineation 
of the switching surface as a numerical function of the state variables 
of the random process and of the controlled system. A useful approxima- 
tion to this surface would be its Taylor series expansion as linear and 


quadratic functions of the state variables. 


The two proposed designs of Chapter 5 have the advantages of (1) 
comparatively uncomplicated instrumentation and (2) rational design theories 
which make use of the state concept and the known saturation limitations 
of the control signal. A complete analog computer investigation of their 
merit would be warranted for comparison with more conventional configura- 


tions. 
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APPENDIX I. 


OPTIMALITY IN DISCRETE LINEAR SYSTEMS 


1. Introduction 


The random signals which have been the concern of the main part 
of this report have been continuous in time, as have been the systems which 
act on them. However, many problems of physical interest deal with random 
Sequences of numbers -- perhaps equally-spaced samples of a continuous 


random process - and associated linear discrete systems, 


There are several good textbooks -- for example, Ragazzini and 
meelins* -- which adequately present sampled-data theory, both deter- 
minate and stochastic. In all, the primary emphasis has been on auto- 
matic control applications, heightened by the increasing use of digital 


computers in control. 


The purpose of this appendix is not to encapsulate this general 
discrete theory, but merely to show how the major results of this partic- 
ular work in continuous random processes are easily extended to the 
case of stationary, ergodic, and discrete stochastic processes. Pertinent 
equations are preceded by the number of their analogous continuous equa- 


tion in brackets. 


2. Fundamental properties of discrete signals and systems 


A discrete signal is a sequence of numbers, such as 
f(n i” 1), {(n), f(n = 1) o ¢ @ @ 


with n indicating discrete time. A "z-transform" is defined by 
Hr 


oe ine Dee Ee so 
where the transform variable a serves to index the associated f(k) to the 
proper time. For example, the sequence 


2 3 k 
|, ay- as; a4 se. ae ean nee ee 
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has a z-transform 





2 2 
lt+azta  z+t,. . 24+ 4a02 
l-az 


The variable z acts as a unit delay operator. Discrete systems 
are constructed with this building block to delay, sum or multiply by 
constants the discrete variables which pass through it. A convenient way 
to visualize this process is to consider the discrete signal as a series of 
impulses with areas equal to the value of the variables. The discrete 
system is then denoted by a transfer function in z, where z = Pa Tis 
the time interval between impulses. This representation allows immediate 
extension of much of the continuous Laplace transform theory. In example, 
if x(n) is the input sequence, y(n) the output sequence, and the system 


transform is given by G(z), then 


Y({z) = X(z) . G(z) 


3, Statistical relationships 


The cross correlation function between two discrete signals, x(n) 


and y(n), is defined by ie 3 | 


(k) = E { x(n) y(n + k) } (A 1.1) 
XY 
and the discrete ''cross power density spectrum" -- a misnomer, but used 
for continuity -- is given by 
oS 
k 
Ooty = 2 OY (k) z (A 1, 2) 
xY k=-e xY 


For later use, the general transformation of the statistical pro- 
perties of the random sequence by linear systems will now be derived, 


in analogy with Section 2.4. 


Suppose an arbitrarily large but finite length of a signal x(n) is 
n 
x(n) z . Also over 


available. Its z-transform is given by x(z) = 


n mM2-N 
the same interval, y(z) = VAM Zee 
M=en-N 


Consider the product term az) y(z) 
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AJ 
wa x(n) z = y(m) Zz 


alk $ ner Zz 


Mr=-N) M=-A/] 


vias) y(z) 


KM 
The coefficient of the kth term, which is multiplied by z , is 


: x(n) y(n + k) 
X N 


But YY ‘ 1 
(k) = lim = Ss x(n) y(n + k) 
xy N—>eo 2N + 1 Pee 
Therefore 
] -] 
(0) (z) = lim = = x(z i») y(z) 
xy N-3<o 2N+ 1 


? ail rv 
if X@)= = X@) CG @) and Y@) = = i (2 ) 
A=} = 








Py @) = Now 3 J 2 Xx, @') G6. @') ra Y;@) He) 
= = z G.@") Hye) An Xe @') Yi) 


which yields (zag. 
5S gn 

SP wm=2 2 Gl yH(2) Oxy tz) (41.3) 
A= | J=!| 


RY 


By steps identical to those of section 2.9, the matrix relation- 


ship [2.32] 


® (2) - 2 2 Gz 1) P xy (2) 7 az) (A 1, 4) 
=I 
is easily obtained, ities 


x(z) z r: G.(z) x,(z)| 
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4. Optimum configurations 


From the arguments of Section 4.5, it is clear that the basic 


statement of optimality for a linear system to operate on this discrete 


signal is [4. 11 | 
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Y ve (k) = 0 
ivj 


. n) 
ges LF sat = © (A 1.5) 


-1 : 
where the "he operator retains its conventional meaning, discarding 


all parts of the term in brackets with negative powers of z. 





Fig. A-1.1 Configuration of optimum multi-dimensional system 
From Eq. A-1.4 and Fig. 
Hence [ 2. i8 | 


Ag} {O wat 


0) (z) can be factored into 
VV 


A-1. 1; 





= D(z) w i(z) 








- def Da} (A 1.6) 


Gia Gate) with the methods of Chapter 3, 
if fmctions of z are regarded as functions of -s ‘a 





sT -sT 
=e ,z=e ). 
G(z) and Oa will both be realizable. It is assumed that the elements 


of D(z) have polynomials in z in the numerator and denominator. 





From the results of section 2. 8, 


io aoen| 
wy = [6%]? va? 4 


-1, -1 
Giz’) @ (rf (A 1.7) 
vi 
9. Special interpretation of optimum systems 


For simplicity, the following discussion refers to single-dimen- 


Sional systems, but the results are readily extended to multi-dimensional 
problems. 


Since O (2) = 





Giz ea) 


G(z) is a linear discrete system 
which can reproduce the observed statistics when excited by a sequence 
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of uncorrelated numbers with unit variance. It will be shown first that 
the optimum predictor for k seconds in the future is a process of measur- 
ing the model state variables and weighting them for k units of initial 


condition decay, according to the results of Section 4. 3. 


For the predictor, 


® (2) = Z zi D (2) 
Then 


W(z) = aur Ag} am Gz) } 


Sane 
Since recovers the excitation of the model G(z), 2% f, > G(2)p 


= 
G(z) 
must provide a transmission to each state variable, and weight by the 
transient decay for k units. Suppose, since this is more in the nature of 
a demonstration than a proof, that 
£ k, 

_ 1 

So) 2 Le a. 2 


Az! 





Here, the partial fraction expansion into r poles gives an allowable 


set of state variables which act in each of the r sub-systems, 
k 
é 
k, a 
=] -k er i 
a2 { ae - az maa 2 1 ea a, Z 
! =/ 


This shows the desired weighting by a. ; 


In the case of an optimum filter, 


QO. Az) = 2) + O 
and . a fo + @ 9 


W = 
G(z) en 








The arguments of section 4.5 indicate that 72 


af Bas r P 32) 
-l 
(z ) 

is the partial fraction expansion of G(z) in the signal poles. Hence [ 4. 5 | 


G(z) = S(z) + N(z) 


the signal and noise parts, respectively, and 


-131- 








u 3(z) - 
W(z) = co) ao (A- 1,8) 


Fig. A-1.2 shows the canonic feedback filter. 





Fig. A-1.2 Optimum discrete filter. 


6. Considerations for optimum linear sampled-data control systems 


For practical end use, a theory of pure numbers such as is built 
up with z-transform is seldom applicable in control. Rather, it is neces- 
sary to convert the discrete information signal into a quantity with physi- 
cal significance which will in turn be the input to a continuous physical 
system. A particular problem will be considered in this section, that of 
an error-sampled control system which must attempt to follow a noisy 


input signal. 


The general approach in this thesis has been to emphasize the mo- 
dels which effectively create random processes, and have signal levels 
which are recoverable and useful. There are three models which could 


account for the sampled input signal, as shown in Fig. A--1. 3, 


IM PULSE 


a) White noise (M) va 
MODULATOR 
b) Uncorrelated é IMPULSE ee 
impulses GS) MY) 
MODULATOR, 
c) Uncorrelated | ee) | f 
e vs 
impulses 
Fig. A-1.3 Various schemes for obtaining a random sequence of impulses, v* 


In (a), the actual random process is sampled. As was discussed 
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in section 4.4, the continuous values of the state variables between sam- 
pling instants are forever lost after sampling. The effect of white noise 
builds up over the sampling interval, and the best estimate of the contin- 
uous state variables is their initial condition decay. The configuration 

of (b) in Fig. A-1.3 reconstructs these state variables at the sampling 
instants, and they do decay as initial conditions until the next impulse 

is received. Accordingly, (b) represents a system which reproduces the 
desired statistics, and contains signal levels which are totally recoverable. 
The configuration of (c) neglects the knowable continuous portion of the 


random process. The various systems are related as follows: 


Let OD (s) be the power density spectrum of the continuous input 


v, and ® *(z) be the spectrum of the sampled v. Therefore, 
H(s) H(-s) = Q_ (s) 
GHz)" GHz) = Oz) 
VV 


ke 
G (z) is a system which can be considered to have an impulse 


response which is the sampled version of one of a continuous system, G(s). 


Figure A-1.4 shows the optimum unity feedback system to recover 
the knowable signal component of v, with G(s) = S(s) + N(s). N'(z) is 
the "starred" discrete system which is the sampled version of N(s). The 
impulse modulator is a mathematical fiction used to represent the process 


of sampling, converting the values of input signal at the sampling instant 





Fig. A-1.4 Optimum error-sampled noise filter 


into areas of output impulses. One typical actual output from a ''sampler" 


or a digital-to-analog converter is shown in Fig. A-1.5. 
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Fig. A-1.5 Typical output of practical sampling device 





= Z 
Cascading the impulse modulator with the transfer function, = , 
can account theoretically for this stair-step signal. If discrete compensa- 
tion is allowable to process the signal in number form, with an output 


of the sort pictured in Fig. A-1.5, the discrete compensation should be 
1 

N(z) (1 - z) 

order to have the overall optimum forward-loop transmission of Fig. A-1. 4. 


and the continuous driven system should be s S(s) in 


7. Conclusions 


The theory of discrete random processes and systems which act 
on them has been sketchily shown to be substantially equal to the continuous 
case, The major deviation occurs when it is necessary to reproduce an 
optimum continuous signal from samples of a random process. Section 
6 of this appendix has presented the optimum feedback filter to perform 


this operation. 
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APPENDIX I. 
A 3X3 EXAMPLE OF MATRIX FACTORIZATION 
To generate the problem, a simple 3x3 system, G (s), is selected 


which has an unrealizable inverse, and whose output power density, when 


excited by white noise, is given by 











ard G"(-s) Recu Ts) (20 28) 
where 
3 
G"(s) = st4 ; : 
; at es x Jk 3 (s+2. 464) (-s+4, 464) 
stl Sst+3 sna} cary Ceiay (et) Gab) 
eee Jj 
s+6 s+5 Ss+2 
= Gi-s) GMs) | — o eae 
~ 0 | aeiew Gere) 


O SSG uc -38°-75 +16 
(-S41)(543)(St1)(S43)  (-541) (- 543) (42) (S45) 


fo Same +i 6st.2)9s"3 a 
(-s+t)(s+4) (~5+2) (-5¢5)( s+ } (#3) (o5+2)Est5)(—Ste) (s+2)(s+5)(ste) 


The problem is to find some G(s), where both G(s) and ame are 


physically realizable, such that 
G(-s) G(s) = ® 4) (2, 277) 


A general solution for this matrix factorization problem has been 
presented in Section 3,5, The following steps follow the notation and 


procedure given there. 


1. Pole removal phase 


-st4 0. 0 
T, (-s) = 
0 (-st+1) (-s+3) 0 
0 0 (-s+2)(-s+5)(-st+6) 


=-T30- 








O6) - p(-s) Os 7. 1s) 
ae oe 





. 9 0 6 (st+5) (s+2) 
2 Z 
0 -5s + 13 (-3s - 75 + 16) (s + 6) 


6 (=st5) (- s + 2) (ages + 7s +16)(-st6) ae - 2178" + 1444 


2. Determinant reduction phase 





D1] = 9 (-s+2, 464) (-s+4, 464) (-s+6) (s+2, 464) (st+4, 464) (s+6) 
1 0 0 
T,(-s) ; 
: 0 1 0 
1 
: : Sao GO. 4 


lp 
Os) = T,(-s) Of 7, (s) 


———_ ee 














6s “+42s+60 
= 9 0 
s+6 
0 _ eae eons Bes lifer AG 
Je , 4 2 
6s -42s + 60 : 367 59s Genie 6s - 217s +1444 
-st+6 (st+6) (-s+6) 
aoe 42s + 60 24 8 
a Ss = ne ees ° = = od = cy —— = 
Mero CC TO ee et ae 3 
- 1 0 0 
T. (-s) = 
0 1 0 
- 8 
3 0 1 
-st+6 
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* 3 
= 9 0 6s + 6 
0 Be fone = 95“ -eaaeniG 
2 2, 
- 68 + 6 ~3s +7s +16 - 6s + 33 
1 0 0 
T (-s) = 
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1 
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2 -3s -%s +16 
0 7 ee s+ 4,464 
- 68 + 6 = 35> eel Bae a 33 
-s+ 4,464 -s + 4.464 (-st+4. 464) (s+4, 464) 
- 68 + 6 a 20. 784 ; a ; = 
Sa aee—i 8 es ts, = 20. 784; Kk, = 2,304 
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Sy 02 7 a 
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9 0 0 | 
0 Deo le = 2.520°s + 4300 
0 2.929 S + 4.00 1, 295 
(s ) = 
|S: | : 
3. Element order reduction phase 
1, 0 0 
T,\-8) 0 1 1.958 
0 0 1 


D)- T,(-s) Os) T. (s) (s) 


0 
4.0 - ©. 
1, 295 


Using the canonic triangular form of Section 3. 3, 


0 
N 
= 0 
1.108 2.704 
The solution for G(s) is given by 
-1 -1 -1 
G(s) = T, (s) T, (s)....T, (s) N (325) 











Se 


All the inverse matrices can be determined by inspection. Alter multiplication, 
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G(s) = 
2.16(s+ 1,67) .040 s 
(s+1) (s+ 3) (s +1) (s + 3) 
1,28(s+ 3.55) . 776 (s + 3.93) 
(s+ 2) (s + 5) (s + 2) (s + 5) 
To check the accuracy of the solution, 
G(-s) GAs = 
q O C 
G S+4){ s#4) (- S+4)(S+0) 
4.9625 °+13.07 Blea < -~ G. OCR oe 
(-Stt )(-5+3) (st) (543) (-St1)(-s+3) (> $+2) (StS) 
© BEG SHO FLO S 410,42 do sous 95° + [Sano 
a ae epee, 
546) (st) (-S#2) (-5+5) (S41) (543) (~S+2)(-s+5)(-54e) (5 $42) (s+5)( 546) 


which is compared with the original @ matrix 


“1 


——— 





( 
Sa (Ss) O (—St4 ( Stu ) 
2 Soa ic 2337 Sanea ee 
(-s+1)(-543) (Str Ise ) eee) 
, me aoe at 7 Soy “se 1444 





eseayisey) © S#2)(- 545) (st1) (543) Cay acsy ~S#L)(s+2) (S45) (S46) 


The differences between the desired and actual results reflect the 
mortality of the author, and do not indicate the accuracy of the method. 


The resulting G(s) has a realizable inverse, since 


a = Ne. . SAE Ro) (3, 6) 





and each of the T(s) is obviously realizable. 


ee 
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